BREAKING THE DUALITY IN THE RETURN TIMES THEOREM 
Abstract. We prove Bourgain's Return Times Theorem for a range of exponents $p$
and $q$ that are outside the duality range. An oscillation result is used to prove hitherto
unknown almost everywhere convergence for the signed average analog of Bourgain's
averages.



1. Introduction 

Almost everywhere convergence results for ergodic weighted averages of various kinds
are typically proved in two steps: first one proves convergence for a small class of functions
(typically $L^\infty$), and then one proves a priori bounds for maximal operators which allow one to
extend the almost everywhere convergence result to larger classes of functions (typically
$L^p$). In many instances, both steps can require rather sophisticated analysis and offer
their own challenges. As the exponent $p$ is lowered, it gets increasingly harder to prove
$L^p$ bounds for maximal operators, and there may be several thresholds at which certain
methods break down. It has been recognized in [22] that time-frequency methods as
pioneered in [14], [19], [23], [24] give the strongest maximal theorems known to date for
the operators that they apply to. The purpose of the current paper is to make time-frequency
methods available for a much wider class of ergodic averages that have enjoyed
some prominence in ergodic theory in recent history. In particular, we are able to break
the threshold of exponents in duality in Bourgain's Return Times Theorem [9], [10].
The methods in this paper are rather robust and typically apply not only to standard
averages but also, for example, to signed and weighted averages with Hilbert kernels as weights.
Moreover, the method typically provides a priori estimates for oscillation norms along
with a priori estimates for maximal operators, and thus removes the need to prove
convergence for a dense subclass ($L^\infty$) along with bounds for maximal operators. In this
paper an oscillation result is used to prove hitherto unknown convergence for the signed
average analog of Bourgain's Return Times Theorem, and to provide a separate proof
of Bourgain's theorem. As in earlier works such as [17], [18] and [25], our methods are
almost entirely analytic in nature; however, the results have independent interest from
both an ergodic theoretic and a harmonic analytic point of view.

Let $\mathbf{X} = (X, \Sigma, \mu, \tau)$ be a dynamical system, that is, a Lebesgue space $(X, \Sigma, \mu)$ equipped
with an invertible bimeasurable measure preserving transformation $\tau : X \to X$. We recall
that a complete probability space $(X, \Sigma, \mu)$ is called a Lebesgue space if it is isomorphic
with the ordinary Lebesgue measure space $([0, 1), \mathcal{L}, m)$, where $\mathcal{L}$ and $m$ denote the usual
Lebesgue algebra and measure (see [21] for more on this topic). In particular, the
$\sigma$-algebra $\Sigma$ (and hence all the spaces $L^p(X)$) will be separable, a property that will be used
later to argue that a certain class of operators act measurably. The system $\mathbf{X}$ is called
ergodic if $A \in \Sigma$ and $\mu(A \,\Delta\, \tau^{-1}A) = 0$ imply $\mu(A) \in \{0, 1\}$.
In [9], [10] (see also [12]) Bourgain proved the following result.

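Theorem 1.1 (Return times theorem, [9], [10]). For each function $f \in L^\infty(X)$ there
is a universal set $X_0 \subseteq X$ with $\mu(X_0) = 1$, such that for each second dynamical system
$\mathbf{Y} = (Y, \mathcal{F}, \nu, \sigma)$, each $g \in L^\infty(Y)$ and each $x \in X_0$, the averages

\[ \frac{1}{N} \sum_{n=0}^{N-1} f(\tau^n x)\, g(\sigma^n y) \]

converge $\nu$-almost everywhere.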

If in the above theorem $f$ is taken to be a constant function, one recovers the classical
pointwise ergodic theorem of Birkhoff, see [8]. However, Theorem 1.1 is much stronger, in
that it shows that given $f$, almost every sampling sequence $(f(\tau^n x))_{n \in \mathbb{N}}$ forms a system
of universal weights for the pointwise ergodic theorem.

Interest in results like Theorem 1.1 can be traced back to the result of Wiener and
Wintner [31], whose equivalent formulation is that for each integrable function $f$, almost
every sampling sequence $(f(\tau^n x))_{n \in \mathbb{N}}$ is a universal system of weights for the mean ergodic
theorem:

Theorem 1.2 (Wiener-Wintner theorem, [31]). For each function $f \in L^1(X)$ there is a
universal set $X_0 \subseteq X$ with $\mu(X_0) = 1$, such that for each $\theta \in [0, 1)$ and each $x \in X_0$ the
following averages converge:

\[ \frac{1}{N} \sum_{n=0}^{N-1} f(\tau^n x)\, e^{2\pi i n \theta}. \]

This result is also an immediate consequence of Theorem 1.1. Indeed, for each $\theta \in [0, 1)$
we can apply the theorem to the system $\mathbf{Y}$ consisting of the interval $[0, 1)$ equipped with
the Lebesgue algebra and measure, together with the transformation $\sigma y := y + \theta \pmod 1$,
and to the function $g(y) := e^{2\pi i y}$.
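
As a purely numerical illustration of the averages in Theorem 1.1 (it plays no role in the arguments of this paper), the following sketch evaluates them when both systems are rotations of the circle. The rotation angles, the functions and the starting points are assumptions made only for this example, and `return_times_average` and `cos_wave` are hypothetical helper names.

```python
import math


def return_times_average(f, g, x, y, alpha, beta, N):
    """Return (1/N) * sum_{n=0}^{N-1} f(x + n*alpha mod 1) * g(y + n*beta mod 1)."""
    total = 0.0
    for n in range(N):
        total += f((x + n * alpha) % 1.0) * g((y + n * beta) % 1.0)
    return total / N


def cos_wave(t):
    """The mean-zero test function cos(2*pi*t) on [0, 1)."""
    return math.cos(2.0 * math.pi * t)


ALPHA = math.sqrt(2.0)  # assumed rotation angle for the first system
BETA = math.sqrt(3.0)   # assumed rotation angle for the second system
X0, Y0 = 0.1, 0.7       # assumed starting points

# Two different rotations: the averages decay to 0.
averages = [
    return_times_average(cos_wave, cos_wave, X0, Y0, ALPHA, BETA, N)
    for N in (10, 100, 1000, 10000)
]

# The same rotation on both sides: the limit is cos(2*pi*(x - y)) / 2.
same_angle = return_times_average(cos_wave, cos_wave, X0, Y0, ALPHA, ALPHA, 10000)
same_angle_limit = 0.5 * math.cos(2.0 * math.pi * (X0 - Y0))
```

Both limits follow from the product-to-sum formula for cosines and a geometric sum; the second case shows that a nonzero limit can occur when the two systems share an eigenvalue.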

An alternative proof of Theorem 1.1, based on the machinery of joinings, is due to
Rudolph [28]. The same author refines his techniques in [29] to prove a deep multiple
return times theorem. Hölder's inequality and an elementary density argument show that
Bourgain's theorem holds for $f \in L^p(X)$ and $g \in L^q(Y)$, whenever $1 \le p, q \le \infty$ and
$\frac1p + \frac1q \le 1$, see [28] and also Section 4 here. On the other hand, it has been recently proved
by Assani, Buczolich and Mauldin [6] that this result fails when $p = q = 1$:

Theorem 1.3 ([6]). Let $\mathbf{X} = (X, \Sigma, \mu, \tau)$ be an ergodic dynamical system. There exist a
function $f \in L^1(X)$ and a subset $X_0 \subseteq X$ of full measure with the following property:
for each $x \in X_0$ and for each ergodic dynamical system $\mathbf{Y} = (Y, \mathcal{F}, \nu, \sigma)$, there exists
$g \in L^1(Y)$ such that the averages

\[ \frac{1}{N} \sum_{n=0}^{N-1} f(\tau^n x)\, g(\sigma^n y) \]

diverge for almost every $y$.

The need for ergodicity in the above theorem is apparent from the observation that if
either $\tau$ or $\sigma$ is the (nonergodic) identity transformation, then a positive result is easily
seen to hold instead, for all integrable functions $f$ and $g$.

An interesting question arises on whether Theorem 1.1 holds outside the duality range: 

Question 1.4. Do there exist indices $1 < p, q < \infty$ with $\frac1p + \frac1q > 1$ such that for each
dynamical system $\mathbf{X} = (X, \Sigma, \mu, \tau)$ and each $f \in L^p(X)$ there is a universal set $X_0 \subseteq X$
with $\mu(X_0) = 1$, such that for each second dynamical system $\mathbf{Y} = (Y, \mathcal{F}, \nu, \sigma)$, each
$g \in L^q(Y)$ and each $x \in X_0$, the averages

\[ \frac{1}{N} \sum_{n=0}^{N-1} f(\tau^n x)\, g(\sigma^n y) \tag{1} \]

converge $\nu$-almost everywhere?



Supporting evidence for a positive result in this direction comes from the fact that 
the duality is indeed broken if either the weights or the test process is replaced with a 
sequence of i.i.d. random variables: 

Theorem 1.5 (I. Assani 2003, [3], [4]). Let $(X_n)$ be a sequence of i.i.d. random variables
with finite $p$-th moment for some $1 < p < \infty$, defined on the probability space $(X, \Sigma, \mu)$.
Then there exists a subset $X^* \subseteq X$ of full measure such that for each $x \in X^*$ the following
holds: for any dynamical system $\mathbf{Y} = (Y, \mathcal{F}, \nu, \sigma)$ and $g \in L^q(Y)$, $1 < q < \infty$, we have

\[ \lim_{N \to \infty} \frac{1}{N} \sum_{n=0}^{N-1} X_n(x)\, g(\sigma^n y) = E(X_1) \int g \, d\nu \]

for $\nu$-almost every $y$.

Theorem 1.6 (I. Assani 1997, [2]; J. Baxter, R. Jones, M. Lin, J. Olsen 2003, [7]).
Assume that either $p > 1$ and $q = 1$, or $p = 1$ and $q > 1$. For each dynamical system
$\mathbf{X} = (X, \Sigma, \mu, \tau)$ and each $f \in L^p(X)$ there is a set $X^* \subseteq X$ of full measure, such that for
each sequence of $L^q$ i.i.d. random variables $Y_n$ defined on the probability space $(Y, \mathcal{F}, \nu)$
and each $x \in X^*$,

\[ \lim_{N \to \infty} \frac{1}{N} \sum_{n=0}^{N-1} f(\tau^n x)\, Y_n(y) \]

exists for $\nu$-almost every $y$.



Similar questions arise in the case of summation operators. We recall that the almost
everywhere convergence of the ergodic truncated Hilbert transform

\[ \lim_{N \to \infty} \sum_{n=-N}^{N}{}' \, \frac{f(\tau^n x)}{n} \tag{2} \]

(the prime indicates that the term $n = 0$ is omitted) was proved by Cotlar [16]. The return
times results for series are harder; the spectral theory and dynamics methods seem to be
inapplicable to address the following question.

Question 1.7. Given $1 \le p, q \le \infty$, is it true that for each dynamical system
$\mathbf{X} = (X, \Sigma, \mu, \tau)$ and each function $f \in L^p(X)$, there is a universal set $X_0 \subseteq X$ with
$\mu(X_0) = 1$, such that for each second dynamical system $\mathbf{Y} = (Y, \mathcal{F}, \nu, \sigma)$, each $g \in L^q(Y)$
and each $x \in X_0$, the series

\[ \lim_{N \to \infty} \sum_{n=-N}^{N}{}' \, \frac{f(\tau^n x)\, g(\sigma^n y)}{n} \tag{3} \]

converges $\nu$-almost everywhere?

It has been shown in [1] that Question 1.7 has a negative answer when $p = 1$, for
arbitrary $q$. Positive results are again known outside the duality range, in the special
case when either the weights or the test process consist of i.i.d. random variables, see
[5]. However, no positive results were known for Question 1.7 prior to this work, not even
when $p = q = \infty$. We note that unlike the case of the averages, Hölder's inequality is of
no use here due to the lack of summability of the sequence $(\frac{1}{n})_{n \in \mathbb{N}}$.
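
For orientation only, the symmetric partial sums of the ergodic truncated Hilbert transform (2) can be computed explicitly in a toy case. The sketch below assumes that the system is a rotation of the circle by an angle $\alpha \in (0,1)$ and that $f(x) = \sin(2\pi x)$; both choices, and the helper names, are assumptions made for this example. Pairing the terms $n$ and $-n$ gives $2\cos(2\pi x)\sum_{n \ge 1} \sin(2\pi n\alpha)/n$, and the classical Fourier series $\sum_{n\ge1}\sin(2\pi n t)/n = \pi(\frac12 - t)$, $0<t<1$, yields the closed form used for comparison.

```python
import math


def truncated_hilbert_sum(f, x, alpha, N):
    """Sum of f(x + n*alpha mod 1) / n over 0 < |n| <= N, pairing n with -n."""
    total = 0.0
    for n in range(1, N + 1):
        forward = f((x + n * alpha) % 1.0)   # term with index +n
        backward = f((x - n * alpha) % 1.0)  # term with index -n, weight -1/n
        total += (forward - backward) / n
    return total


def sin_wave(t):
    """The test function sin(2*pi*t) on [0, 1)."""
    return math.sin(2.0 * math.pi * t)


ALPHA = math.sqrt(2.0) - 1.0  # assumed rotation angle, lies in (0, 1)
X0 = 0.3                      # assumed starting point

partial_sum = truncated_hilbert_sum(sin_wave, X0, ALPHA, 100000)
closed_form = 2.0 * math.cos(2.0 * math.pi * X0) * math.pi * (0.5 - ALPHA)
```

The convergence here relies on the cancellation between the terms $n$ and $-n$, not on absolute summability, which is the feature that makes Hölder's inequality unavailable for series.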

We close this discussion with a parallel between return times results for averages and
series. Spectral theory is an important component of all four known proofs of Theorem 1.1.
Three of them use purely dynamical (in particular non-Fourier-analytical)
methods and rely on the spectral decomposition according to which each function can be
decomposed into a component with a purely discrete spectral measure plus a component
with continuous spectral measure. If $f_1$ and $g_1$ represent the continuous components of $f$
and $g$ while $f_2$ and $g_2$ are the discrete components, then the proof shows that the limit of
the averages

\[ \frac{1}{N} \sum_{n=0}^{N-1} f_i(\tau^n x)\, g_j(\sigma^n y) \]

is 0, as long as $1 \in \{i, j\}$. That is to say, the Kronecker factor (i.e. the sub-$\sigma$-algebra
spanned by the eigenfunctions of the transformation) is characteristic for the (almost
everywhere and norm) convergence of these averages.

This type of spectral analysis has not proven successful so far in proving convergence 
results for Hilbert series like the ones in Question 1.7. The Kronecker factor is not expected 
to play the same role as in the case of averages. In particular, not even the series in (2) will 
converge to zero for all functions with continuous spectrum. This suggests that, perhaps, 
the answer to these questions does not lie in dynamics, but rather in analytic methods. 

In this paper we will answer Question 1.4 affirmatively and will prove a similar result
for Question 1.7. These theorems are described in detail in Section 3.

2. Notation and terminology 

If $I \subset \mathbb{R}$ is an interval then $c(I)$ denotes the center of $I$, $|I|$ denotes the length, and
$CI$ is the interval with the same center and length $C$ times the length of $I$. By $1_A$ we
denote the characteristic function of the set $A \subset \mathbb{R}$, while for any interval $I$, we define
the weight function

\[ \chi_I(x) := \Big( 1 + \frac{|x - c(I)|}{|I|} \Big)^{-1}. \]

A tile $s$ is a rectangle $s = I_s \times \omega_s$ with $I_s$ some dyadic interval and $\omega_s$ some interval
satisfying $|I_s| \cdot |\omega_s| = 1$.
The notation $a \lesssim b$ means that $a \le cb$ for some universal constant $c$, and $a \sim b$ means
that $a \lesssim b$ and $b \lesssim a$. These constants are allowed to depend on the exponents $p$ and
$q$. Sometimes we will write $\lesssim \chi_I^M(x)$ with unspecified $M$ to indicate that this
inequality holds for all $M \ge 1$, with implicit constant depending only on $M$. Also, for
each $1 \le p < \infty$ we use the $p$-th power Hardy-Littlewood maximal operator

\[ M_p(f)(x) := \Big( \sup_{r > 0} \frac{1}{r} \int_{|t| \le r} |f(x + t)|^p \, dt \Big)^{1/p} \]

and the BMO norm

\[ \|f\|_{BMO(\mathbb{R})} := \sup_I \frac{1}{|I|} \int_I \Big| f - \frac{1}{|I|} \int_I f \Big|, \]

where $I$ ranges over all intervals.

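The Fourier transform of a function $f : \mathbb{R} \to \mathbb{R}$ is

\[ \hat{f}(\xi) := \mathcal{F}(f)(\xi) = \int f(x) e^{-2\pi i x \xi} \, dx, \]

while the inverse Fourier transform is

\[ \check{f}(\xi) := \mathcal{F}^{-1}(f)(\xi) = \int f(x) e^{2\pi i x \xi} \, dx. \]

Define the dilation, translation, and modulation operators

\[ \mathrm{Dil}^p_s f(x) := s^{-1/p} f(x/s), \]
\[ \mathrm{Tr}_y f(x) := f(x - y), \]
\[ \mathrm{Mod}_\theta f(x) := e^{2\pi i \theta x} f(x). \]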
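
The normalization $s^{-1/p}$ is chosen so that $\mathrm{Dil}^p_s$ preserves the $L^p$ norm, as do translations and modulations. The following sketch checks this numerically on a Gaussian with a midpoint rule; the function names `dil`, `tr`, `mod`, `lp_norm`, the test function and the quadrature parameters are all assumptions introduced for this illustration.

```python
import cmath
import math


def dil(f, s, p):
    """Dil^p_s f(x) = s^(-1/p) * f(x/s)."""
    return lambda x: s ** (-1.0 / p) * f(x / s)


def tr(f, y):
    """Tr_y f(x) = f(x - y)."""
    return lambda x: f(x - y)


def mod(f, theta):
    """Mod_theta f(x) = exp(2*pi*i*theta*x) * f(x)."""
    return lambda x: cmath.exp(2j * math.pi * theta * x) * f(x)


def lp_norm(f, p, lo=-60.0, hi=60.0, steps=60000):
    """Midpoint-rule approximation of the L^p norm of f over [lo, hi]."""
    h = (hi - lo) / steps
    total = sum(abs(f(lo + (k + 0.5) * h)) ** p for k in range(steps))
    return (total * h) ** (1.0 / p)


def gauss(x):
    """Gaussian exp(-pi*x^2); its L^p norm is p^(-1/(2p))."""
    return math.exp(-math.pi * x * x)


P = 3.0
base_norm = lp_norm(gauss, P)
dilated_norm = lp_norm(dil(gauss, 4.0, P), P)
moved_norm = lp_norm(mod(tr(gauss, 1.5), 0.25), P)
```

All three numbers agree up to quadrature error, which is the invariance used when operators are rescaled to a fixed tile.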

Definition 2.1. For each $M > 0$, let $A(M)$ be some big universal constants, that will
stay fixed throughout this paper. A function $\varphi_I$ is said to be $C$-adapted to the interval $I$ if
for each such $M$,¹

\[ |\varphi_I(x)| \le A(M)\, C\, |I|^{-1/2} \chi_I^M(x), \]
\[ \Big| \frac{d}{dx} \varphi_I(x) \Big| \le A(M)\, C\, |I|^{-3/2} \chi_I^M(x). \]

The constant $C$ will vary throughout this paper and will always be specified explicitly.

Definition 2.2. A function $\varphi_I$ is said to have the mean zero property with respect to a
frequency $c$ if

\[ \int \varphi_I(x) e^{-ixc} \, dx = 0. \]



1 Actually, our proof will only require these decay bounds for a finite number of M, though the number 
of such M can depend on exponents such as p. 
3. Main results and high-level overview of the proof

Our first result here gives an affirmative answer to Question 1.4, by extending Bourgain's
Return Times Theorem to the range $1 < p \le \infty$ and $q \ge 2$.

Theorem 3.1. Let $1 < p \le \infty$ and $q \ge 2$ be some arbitrary indices. For each function
$f \in L^p(X)$ there is a universal set $X_0 \subseteq X$ with $\mu(X_0) = 1$, such that for each second
dynamical system $\mathbf{Y} = (Y, \mathcal{F}, \nu, \sigma)$, each $g \in L^q(Y)$ and each $x \in X_0$, the averages

\[ \frac{1}{N} \sum_{n=0}^{N-1} f(\tau^n x)\, g(\sigma^n y) \]

converge $\nu$-almost everywhere.

Given the convergence for $L^\infty$ functions $f$ and $g$, an approximation argument as in
Theorem 4.3 will immediately prove the above, once we establish the following maximal
inequality:

Theorem 3.2. For each dynamical system $\mathbf{X} = (X, \Sigma, \mu, \tau)$, each $1 < p < \infty$ and each
$f \in L^p(X)$,

\[ \Big\| \sup_{(Y, \mathcal{F}, \nu, \sigma)} \sup_{\|g\|_{L^2(Y)} = 1} \Big\| \sup_N \Big| \frac{1}{N} \sum_{n=0}^{N-1} f(\tau^n x)\, g(\sigma^n y) \Big| \Big\|_{L^2_y(Y)} \Big\|_{L^p_x(X)} \lesssim \|f\|_{L^p(X)}, \tag{4} \]

where the first supremum in the inequality above is taken over all dynamical systems
$\mathbf{Y} = (Y, \mathcal{F}, \nu, \sigma)$. Here we have subscripted some of our $L^p$ norms to clarify the variable
being integrated over.

Remark 3.3. The measurability in both inequality (4) and inequality (5) below is
proved by an application of Conze's principle (Theorem 4.2) and the separability of each
$L^2(Y)$, and the reader is referred to the proof of Theorem 4.3 for details.

Inequality (4) is only new for $1 < p < 2$. When $p \ge 2$ it is an immediate consequence
of Hölder's inequality and the boundedness of the ergodic maximal function in every $L^p$,
$p > 1$.

The analog of Theorem 3.2 for series also holds: 
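Theorem 3.4. For each dynamical system $\mathbf{X} = (X, \Sigma, \mu, \tau)$, each $1 < p < \infty$ and each
$f \in L^p(X)$,

\[ \Big\| \sup_{(Y, \mathcal{F}, \nu, \sigma)} \sup_{\|g\|_{L^2(Y)} = 1} \Big\| \sup_N \Big| \sum_{n=-N}^{N}{}' \, \frac{f(\tau^n x)\, g(\sigma^n y)}{n} \Big| \Big\|_{L^2_y(Y)} \Big\|_{L^p_x(X)} \lesssim \|f\|_{L^p(X)}, \tag{5} \]

where the first supremum in the inequality above is taken over all dynamical systems
$\mathbf{Y} = (Y, \mathcal{F}, \nu, \sigma)$.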

Note that no particular case of the maximal inequality (5) was previously known. It is
also worth observing that Hölder's inequality is not applicable in this context. The
inequalities (4) and (5) are obtained via standard transfer methods from the following
general result, as explained in Section 5.
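Theorem 3.5. Let $K : \mathbb{R} \to \mathbb{R}$ be an $L^2$ kernel satisfying the requirements:

\[ \hat{K} \in C^\infty(\mathbb{R} \setminus \{0\}), \tag{6} \]
\[ |\hat{K}(\xi)| \lesssim \min\Big\{1, \frac{1}{|\xi|}\Big\}, \quad \forall \xi \ne 0, \tag{7} \]
\[ \Big| \frac{d^n}{d\xi^n} \hat{K}(\xi) \Big| \lesssim \frac{1}{|\xi|^n} \min\Big\{|\xi|, \frac{1}{|\xi|}\Big\}, \quad \forall \xi \ne 0, \ n \ge 1. \tag{8} \]

Then the following inequality holds for each $1 < p < \infty$:

\[ \Big\| \sup_{\|g\|_{L^2(\mathbb{R})} = 1} \Big\| \sup_{k \in \mathbb{Z}} \Big| \frac{1}{2^k} \int f(x + y)\, g(z + y)\, K\Big(\frac{y}{2^k}\Big) \, dy \Big| \Big\|_{L^2_z(\mathbb{R})} \Big\|_{L^p_x(\mathbb{R})} \lesssim \|f\|_{L^p(\mathbb{R})}. \tag{9} \]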



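Remark 3.6. Due to the fact that $K \in L^2$, the quantity

\[ \frac{1}{2^k} \int f(x + y)\, g(z + y)\, K\Big(\frac{y}{2^k}\Big) \, dy \]

is defined for each $g \in L^2$ and every $x$ and $z$, assuming $f$ is an $L^\infty$ function with bounded
support. Inequality (9) will be proved with this extra requirement on $f$; density
arguments will then provide it with a meaning for all $f \in L^p$. It further follows that for each
$x \in \mathbb{R}$ the quantity

\[ \sup_{\|g\|_{L^2(\mathbb{R})} = 1} \Big\| \sup_{k \in \mathbb{Z}} \Big| \frac{1}{2^k} \int f(x + y)\, g(z + y)\, K\Big(\frac{y}{2^k}\Big) \, dy \Big| \Big\|_{L^2_z(\mathbb{R})} \]

is well defined and gives rise to a measurable function of $x$.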

Remark 3.7. A somewhat similar, yet distinct operator is the following bilinear maximal
function for which bounds are proved in [22]:

\[ B^*(f, g)(x) = \sup_{k \in \mathbb{Z}} \Big| \frac{1}{2^k} \int f(x + y)\, g(x - y)\, K\Big(\frac{y}{2^k}\Big) \, dy \Big|. \]

While the functions $f$ and $g$ play a symmetric role in the above, their contribution to the
return times operator in inequality (9) is significantly different.

Also, unlike in the case of the bilinear maximal function, the signs of $y$ in the innermost
expression on the left hand side of (9) have no deep significance at all. More generally, a
simple scaling-dilation argument shows that the inequalities (9) with $f(x + ay)\, g(z + by)$ are
all equivalent, for each choice of $a, b \ne 0$.

One immediate consequence of the above result is the following. 

Corollary 3.8. For each $f \in L^p(\mathbb{R})$ we have

\[ \Big\| \sup_{\|g\|_{L^2(\mathbb{R})} = 1} \Big\| \sup_{t > 0} \frac{1}{2t} \int_{-t}^{t} |f(x + y)\, g(z + y)| \, dy \Big\|_{L^2_z(\mathbb{R})} \Big\|_{L^p_x(\mathbb{R})} \lesssim \|f\|_{L^p(\mathbb{R})}, \quad 1 < p < \infty. \tag{10} \]

The corollary is trivial for $p \ge 2$, by Hölder's inequality. To see how the result for general
$p$ follows from that of Theorem 3.5, choose $K$ to be some positive Schwartz function and
note that it suffices to assume that $f$ and $g$ are positive and also to restrict the supremum
in (10) to dyadic values of $t$.

The second corollary is the analog of the first one for singular integrals. 
Corollary 3.9. For each $f \in L^\infty(\mathbb{R})$ with finite support we have

\[ \Big\| \sup_{\|g\|_{L^2(\mathbb{R})} = 1} \Big\| \sup_{t > 0} \Big| \int_{|y| > t} f(x + y)\, g(z + y) \, \frac{dy}{y} \Big| \Big\|_{L^2_z(\mathbb{R})} \Big\|_{L^p_x(\mathbb{R})} \lesssim \|f\|_{L^p(\mathbb{R})}, \quad 1 < p < \infty. \tag{11} \]

Note again that the integral above is defined for each $g \in L^2$ and each $x$ and $z$, due to
the kernel $\frac{1}{y} 1_{\{|y| > t\}}$ being in $L^2$. Consider a $C^\infty(\mathbb{R})$ kernel $K$ such that $K(y) = \frac{1}{y}$ for $|y| \ge 1$.
The proof of the above corollary follows from the following two observations. On the one
hand, by using Corollary 3.8 it suffices to prove Corollary 3.9 with $K$ replacing the rough
kernel $\frac{1}{y} 1_{\{|y| > 1\}}$ and with the supremum restricted to dyadic values of $t$. On the other
hand, it is an easy exercise to prove that $K$ satisfies the requirements of Theorem 3.5.

As far as Question 1.7 is concerned, we remark that Theorem 3.4 alone cannot provide an
answer to it, because a dense class result is missing. It turns out, however, that
the techniques used in Theorem 3.4 can be refined to prove the following analog for series
of Bourgain's Return Times Theorem.

Theorem 3.10. For each function $f \in L^\infty(X)$ there is a universal set $X_0 \subseteq X$ with
$\mu(X_0) = 1$, such that for each second dynamical system $\mathbf{Y} = (Y, \mathcal{F}, \nu, \sigma)$, each $g \in L^\infty(Y)$
and each $x \in X_0$, the series

\[ \sum_{n=-N}^{N}{}' \, \frac{f(\tau^n x)\, g(\sigma^n y)}{n} \]

converges $\nu$-almost everywhere.



Now Theorems 3.4 and 3.10 together with an approximation argument as in
Theorem 4.3 lead to the following corollary.

Corollary 3.11. Let $1 < p \le \infty$ and $q \ge 2$ be some arbitrary indices. For each function
$f \in L^p(X)$ there is a universal set $X_0 \subseteq X$ with $\mu(X_0) = 1$, such that for each second
dynamical system $\mathbf{Y} = (Y, \mathcal{F}, \nu, \sigma)$, each $g \in L^q(Y)$ and each $x \in X_0$, the series

\[ \sum_{n=-N}^{N}{}' \, \frac{f(\tau^n x)\, g(\sigma^n y)}{n} \]

converges $\nu$-almost everywhere.

Remark 3.12. It actually turns out that the same methods can be used to give yet another
proof² of Bourgain's Return Times Theorem 1.1, see Section 5.4.

Choose $\mathbf{Y}$ to be the interval $[0, 1)$ equipped with the Lebesgue algebra and measure
together with the transformation $\sigma(y) := y + \theta \pmod 1$, while $g(y) := e^{2\pi i y}$. The above
corollary applied to the dynamical system $\mathbf{Y}$ provides the following Wiener-Wintner result
for series.



2 The proofs in [12], [28] and [29] use dynamics. Bourgain's original argument [9], [10] uses classical
Fourier analysis geared towards getting entropy estimates for multipliers. The proof along the techniques
developed in our paper, while inspired by more recent developments in time-frequency harmonic analysis,
shares similarities with Bourgain's argument; in particular, a special case of Theorem 8.7 here also played
a crucial role in Bourgain's original argument.
Corollary 3.13. Given $1 < p \le \infty$, for each dynamical system $\mathbf{X} = (X, \Sigma, \mu, \tau)$ and
each function $f \in L^p(X)$ there is a universal set $X_0 \subseteq X$ with $\mu(X_0) = 1$, such that for
each $\theta \in [0, 1)$ and each $x \in X_0$ the following series converges:

\[ \sum_{n=-N}^{N}{}' \, \frac{f(\tau^n x)\, e^{2\pi i n \theta}}{n}. \]

A separate proof of the above result appears also in [25]. The methods used there are 
not strong enough to address the rest of the results obtained in this paper. 

Since in general only quantitative inequalities transfer from harmonic analysis to ergodic
theory, in order to prove Theorem 3.10 via a transfer argument, the almost everywhere
convergence needs to be quantified in some way. Our approach relies on proving an
oscillation inequality, which will be shown to imply³ almost everywhere convergence in
Section 5.⁴

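Theorem 3.14. Let $K : \mathbb{R} \to \mathbb{R}$ be an $L^2$ kernel satisfying (6), (7) and (8). Then for
each $1 < p < \infty$ there is $0 < \epsilon(p) < \frac12$ such that the following holds: for each $d = 2^{1/n}$,
$n \in \mathbb{N}$, and for each finite sequence of integers $k_1 < k_2 < \dots < k_J$,

\[ \Big\| \sup_{\|g\|_{L^2(\mathbb{R})} = 1} \Big\| \Big( \sum_{j=1}^{J-1} \sup_{\substack{k \in \mathbb{Z} \\ k_j \le k < k_{j+1}}} \Big| \int f(x + y)\, g(z + y)\, \big(\mathrm{Dil}^1_{d^k} K - \mathrm{Dil}^1_{d^{k_{j+1}}} K\big)(y) \, dy \Big|^2 \Big)^{1/2} \Big\|_{L^2_z(\mathbb{R})} \Big\|_{L^p_x(\mathbb{R})} \lesssim J^{\frac12 - \epsilon(p)} \|f\|_{L^p(\mathbb{R})}, \]

with the implicit constants depending only on $n$ and $p$.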

This theorem is a consequence of two distinct results of dyadic analysis. The first one, 
Theorem 3.15, is the particular case d = 2 of the above and captures the main difficulty of 
the problem. The second one, Theorem 3.16, is a square function estimate and is meant 
to control error terms. 

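To better understand the connection between Theorems 3.14, 3.15 and 3.16 we introduce
some notation. Let $h : (0, \infty) \to \mathbb{C}$. Let also $k_1 < \dots < k_J$ be as in Theorem 3.14 and
define integers $a_1 \le \dots \le a_J$ such that $a_j n \le k_j < (a_j + 1) n$. Then observe that

\[ \Big( \sum_{j=1}^{J-1} \sup_{\substack{k \in \mathbb{Z} \\ k_j \le k < k_{j+1}}} \Big| h\Big(\frac{k}{n}\Big) - h\Big(\frac{k_{j+1}}{n}\Big) \Big|^2 \Big)^{1/2}
\lesssim \sum_{i=0}^{n-1} \Big( \sum_{j=1}^{J-1} \sup_{\substack{k \in \mathbb{Z} \\ a_j \le k < a_{j+1}}} \Big| h\Big(k + \frac{i}{n}\Big) - h\Big(a_{j+1} + \frac{i}{n}\Big) \Big|^2 \Big)^{1/2}
+ \sum_{i=0}^{n-1} \Big( \sum_{k \in \mathbb{Z}} \Big| h\Big(k + \frac{i+1}{n}\Big) - h\Big(k + \frac{i}{n}\Big) \Big|^2 \Big)^{1/2}. \]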

Using this inequality and a dilation argument, Theorem 3.14 will follow immediately 
from the following two results. 



3 It will become clear in Section 5.3 that the result of Theorem 3.14 for any particular p suffices to 
imply Theorem 3.10. 

4 This type of approach has been used before in ergodic theory, see for example [11]. 
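Theorem 3.15. Let $K : \mathbb{R} \to \mathbb{R}$ be an $L^2$ kernel satisfying (6), (7) and (8). Then for
each $1 < p < \infty$ there is $0 < \epsilon(p) < \frac12$ such that for each finite sequence of integers
$k_1 < k_2 < \dots < k_J$,

\[ \Big\| \sup_{\|g\|_{L^2(\mathbb{R})} = 1} \Big\| \Big( \sum_{j=1}^{J-1} \sup_{\substack{k \in \mathbb{Z} \\ k_j \le k < k_{j+1}}} \Big| \int f(x + y)\, g(z + y)\, \big(\mathrm{Dil}^1_{2^k} K - \mathrm{Dil}^1_{2^{k_{j+1}}} K\big)(y) \, dy \Big|^2 \Big)^{1/2} \Big\|_{L^2_z(\mathbb{R})} \Big\|_{L^p_x(\mathbb{R})} \lesssim J^{\frac12 - \epsilon(p)} \|f\|_{L^p(\mathbb{R})}, \]

with the implicit constants depending only on $p$.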

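Theorem 3.16. Let $K : \mathbb{R} \to \mathbb{R}$ be an $L^2$ kernel satisfying (6), (7) and (8) and the extra
requirement

\[ |\hat{K}(\xi)| \lesssim |\xi|. \tag{12} \]

Then for each $1 < p < \infty$ the following inequality holds:

\[ \Big\| \sup_{\|g\|_{L^2(\mathbb{R})} = 1} \Big\| \Big( \sum_{k \in \mathbb{Z}} \Big| \int f(x + y)\, g(z + y)\, \big(\mathrm{Dil}^1_{2^k} K\big)(y) \, dy \Big|^2 \Big)^{1/2} \Big\|_{L^2_z(\mathbb{R})} \Big\|_{L^p_x(\mathbb{R})} \lesssim \|f\|_{L^p(\mathbb{R})}, \]

with the implicit constants depending only on $p$.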

Our approach to Theorems 3.5, 3.15 and 3.16 relies on using time-frequency techniques to
bound discrete model operators. This amounts to decomposing the time-frequency plane
into dyadic rectangles $s = I_s \times \omega_s$ (also called tiles), associated with highly localized wave
packets $\phi_s(x, \theta)$, $\varphi_s(x)$. The decomposition is guided by the nature of the operator under
investigation, and the goal is to reduce the proof of its boundedness to that of the discrete
model sums

\[ \sum_s \langle f, \varphi_s \rangle \phi_s(x, \theta) \]

in some appropriate norm. Our proof of Theorem 3.5 has emerged from the discovery of
striking connections between the model operator for the return times operator and the
Carleson-Hunt operator

\[ Cf(x, \theta) := \mathrm{p.v.} \int_{\mathbb{R}} f(x - y)\, \frac{e^{2\pi i y \theta}}{y} \, dy, \]

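which controls the convergence of the Fourier series. To clarify this connection we
introduce some notation. For each $1 < p < \infty$, the $M_p$ multiplier norm of a function
$m : \mathbb{R} \to \mathbb{R}$ is defined as

\[ \|m\|_{M_p(\mathbb{R})} = \|m(\theta)\|_{M_{p,\theta}(\mathbb{R})} := \sup_{\|h\|_p = 1} \Big\| \int m(\theta)\, \hat{h}(\theta)\, e^{2\pi i \theta x} \, d\theta \Big\|_{L^p_x(\mathbb{R})}. \]

Of course the $M_2(\mathbb{R})$ norm is just the $L^\infty(\mathbb{R})$ norm, $\|m\|_{M_2(\mathbb{R})} = \|m\|_{L^\infty(\mathbb{R})}$. Similarly,
the $M_p^*$ norm of a sequence of multipliers $m_k : \mathbb{R} \to \mathbb{R}$ is defined as

\[ \|(m_k)_{k \in \mathbb{Z}}\|_{M_p^*(\mathbb{R})} = \|(m_k(\theta))_{k \in \mathbb{Z}}\|_{M^*_{p,\theta}(\mathbb{R})} := \sup_{\|h\|_p = 1} \Big\| \sup_k \Big| \int m_k(\theta)\, \hat{h}(\theta)\, e^{2\pi i \theta x} \, d\theta \Big| \Big\|_{L^p_x(\mathbb{R})}. \]

The celebrated theorem of Carleson-Hunt asserts the following: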



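Theorem 3.17. For each $1 < p < \infty$ and each $f \in L^p(\mathbb{R})$,

\[ \Big\| \|Cf(x, \theta)\|_{L^\infty_\theta(\mathbb{R})} \Big\|_{L^p_x(\mathbb{R})} \lesssim \|f\|_{L^p(\mathbb{R})}, \]

or equivalently

\[ \Big\| \|Cf(x, \theta)\|_{M_{2,\theta}(\mathbb{R})} \Big\|_{L^p_x(\mathbb{R})} \lesssim \|f\|_{L^p(\mathbb{R})}. \]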

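It turns out that there is an appropriate choice of wave packets $\varphi_s$ and $\phi_s$ such that
Theorem 3.17 can be reduced to showing that

\[ \Big\| \Big\| \sum_s \langle f, \varphi_s \rangle \phi_s(x, \theta) \Big\|_{M_{2,\theta}(\mathbb{R})} \Big\|_{L^p_x(\mathbb{R})} \lesssim \|f\|_{L^p(\mathbb{R})}, \tag{13} \]

while Theorem 3.5 can be reduced to showing that

\[ \Big\| \Big\| \Big( \sum_{s : |I_s| \ge 2^k} \langle f, \varphi_s \rangle \phi_s(x, \theta) \Big)_{k \in \mathbb{Z}} \Big\|_{M^*_{2,\theta}(\mathbb{R})} \Big\|_{L^p_x(\mathbb{R})} \lesssim \|f\|_p. \tag{14} \]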

The proof of Theorem 3.15 relies on the same techniques as the ones utilized in
Theorem 3.5, with an extra twist created by the oscillations of the operators in question. The
main new ingredient here is Theorem 8.11, whose estimates incorporate both the maximal
and the oscillatory behavior of the multiplier.

In contrast, Theorem 3.16 does not encode any maximal or oscillatory behavior. Its 
proof does not need any new ingredients, other than the ones we use to produce an 
(implicit) proof of the Carleson-Hunt theorem. 

The main novelty of our approach in this paper resides in getting local estimates
for the model operator, as opposed to proving global estimates via dualization. This latter
strategy was successful in dealing with maximal operators of similar complexity, such as those
in [17], [18], [22]. Our search for this new type of approach was guided by the nature
of the $M_2^*$ norm, which makes the dualization of (14) extremely hard to handle. We thus
had to develop a set of techniques that do not involve the dual of the $M_2^*$ norm. We note
that the $M_2$ norm is much more amenable to dualization. This fact was observed in [24]
in the context of the Carleson-Hunt operator, where dualization of the $M_2$ norm was used
to create an interplay between energy and mass.

Here is an overview of our proof of inequality (14). In Section 6 we indicate how to
reduce Theorems 3.5, 3.15 and 3.16 to similar statements about discrete model operators.
The details for our main result, Theorem 3.5, are as follows. For each scale $k \in \mathbb{Z}$ we
further decompose the model operator $\sum_{s : |I_s| \ge 2^k} \langle f, \varphi_s \rangle \phi_s(x, \theta)$ into the sum of two distinct
operators with good frequency localization.

The first one is controlled by a weighted version of the maximal multiplier
result of Bourgain, in which the multiplier assumes different values depending on
$k$ and on the frequency base point. The proof of this result is presented in Section 8 and
its later application depends on variational estimates proved in Section 9. The second
operator is essentially a composition of Bourgain's original maximal operator and
Carleson's operator, and as a consequence its boundedness depends on the boundedness
of these two fundamental operators.

Our analysis of the return times operator is then guided by time localization, in that for
each $x$ on the time axis we split the contribution coming from various trees in terms of their
spatial localization with respect to $x$. We then get pointwise (rather than global $L^p$ norm)
estimates for the model operator at each $x$ outside an appropriately chosen exceptional
set. One immediate advantage of this type of localization is that it reduces substantially
the combinatorial difficulty of organizing the trees into structured subcollections. Indeed,
the contribution to a given $x$ on the time axis will essentially come from just one stack
of trees. The fact that $x$ is chosen outside the exceptional set will guarantee control both
over the number of trees in the stack (which makes the weighted Bourgain's multiplier
result effective, see Section 8) and over the size of the weights (via BMO estimates, see
Section 9). The remaining details of the proof are then presented in the last two sections
of the paper.

This new method of estimating the model operator locally, as opposed to the previously
employed global approach, has first led us to a new proof of the Carleson-Hunt theorem.
This proof is in the spirit of Carleson's original argument in that it uses energy but not
mass; however, it uses a completely different mathematical language and set of tools. The
proof is incorporated in the main argument, and is used to control the second operator
mentioned above.

An approach to the Return Times theorem in the case $1 < q < 2$ along the lines of
Theorem 3.5 would involve estimates both on the $M_q^*$ norm of the weighted Bourgain's
maximal multilinear operator in Theorem 8.7 and on the $M_q$ norm of the model sums
associated with Carleson's operator. Crucial to our proof of the case $q = 2$ in Theorem 3.5
is the fact that the $M_2^*$ norm of the first operator is small as a function of the number
$L$ of frequency basepoints.⁵ The $M_q^*$ norm is significantly larger when $q \ne 2$. More
precisely, it is shown in Section 8 that this norm is at least of the order of $L^{|1/2 - 1/q|}$ for
each $q \in (1, 2) \cup (2, \infty)$.

On the other hand, the $L$ dependency of the $M_q^*$ norm for $1 < q < 2$ is at most
$L^{2/q - 1}$, which is what one gets by interpolating with the $M_1^*$ norm. Even with these large
bounds our methods still seem to produce partial results in Theorem 3.5 for other values
of $q$, assuming good control over the $M_q$ norm of the model sums associated with
Carleson's operator.⁶ This will appear elsewhere.

4. The approximation argument 

Let $(Y_i, \mathcal{F}_i, \nu_i)$, $i = 1, 2$, be some arbitrary Lebesgue spaces. Denote by $C(Y_i)$ the family
of all the $\nu_i$-measure preserving transformations on $Y_i$. Equip $C(Y_i)$ with the topology
of weak convergence, in which $\tau_s \to \tau$ if and only if $\nu_i(\tau_s A \,\Delta\, \tau A) \to 0$ for each $A \in \mathcal{F}_i$.
We will also denote by $C(Y_1, Y_2)$ the set of all invertible, bimeasurable transformations
$\beta : Y_1 \to Y_2$ which take the measure $\nu_1$ to the measure $\nu_2$. The following result is due to
Halmos [21].

Lemma 4.1. If $\mathbf{Y}_1 = (Y_1, \mathcal{F}_1, \nu_1, \sigma_1)$ is ergodic then the set

\[ \{ \beta \sigma_1 \beta^{-1} : \beta \in C(Y_1, Y_2) \} \]

is dense in $C(Y_2)$ in the weak topology.

5 The bound obtained in Theorem 8.7 is of the order $L^\epsilon$, for arbitrarily small $\epsilon > 0$. Any improvement
over the trivial bound of $L^{1/2}$ produces positive results for some range of $p < 2$, and the fact that the
bound is actually $L^\epsilon$ extends the result to the full range $1 < p < \infty$. While the $L^\epsilon$ bound suffices for our
applications here, it would be interesting to know its correct order of magnitude.

6 This is currently being investigated by the last two authors here together with other authors.
Consider now a sequence $S_N$ of weighted operators acting on the measurable functions
in each system, according to the formula

\[ S_N g(y) := \sum_{n = r_N}^{p_N} w_{N,n}\, g(\sigma^n y), \]

where the weights $\{w_{N,n}\}$ are arbitrary complex numbers. Denote by $S^*$ the maximal
operator $S^* g = \sup_N |S_N g|$. The following version of the so-called Conze's principle is a
consequence of the above lemma (see [15] for a similar version of this result).

Theorem 4.2 (Conze's Principle). Let $\mathbf{Y}_i = (Y_i, \mathcal{F}_i, \nu_i, \sigma_i)$, $i = 1, 2$, be two dynamical
systems, with $\mathbf{Y}_1$ ergodic. Then for each $1 < q < \infty$

\[ \sup_{\|g\|_{L^q(Y_1)} = 1} \|S^* g\|_{L^q(Y_1)} \ge \sup_{\|g\|_{L^q(Y_2)} = 1} \|S^* g\|_{L^q(Y_2)}. \]

In particular, if both systems are ergodic then the left and right hand sides are equal.

We use this to prove the following general approximation result. 

Theorem 4.3 (The approximation argument). Fix some $1 < p, q < \infty$ and consider
the dynamical systems $\mathbf{X} = (X, \Sigma, \mu, \tau)$ and $\mathbf{Y} = (Y, \mathcal{F}, \nu, \sigma)$, where the second one is
assumed to be ergodic. Consider a sequence of bilinear operators defined as

\[ T_N(f, g)(x, z) := \sum_{n=-N}^{N} w_{N,n}\, f(\tau^n x)\, g(\sigma^n z) \]

for each dynamical system $\mathbf{Z} = (Z, \mathcal{T}, m, \rho)$, each $f \in L^p(X)$ and $g \in L^q(Z)$. Assume
that

\[ \Big\| \sup_{\|g\|_{L^q(Y)} = 1} \Big\| \sup_N |T_N(f, g)(x, y)| \Big\|_{L^q_y(Y)} \Big\|_{L^p_x(X)} \lesssim \|f\|_{L^p(X)}. \tag{15} \]

Assume also that for each function $f \in L^\infty(X)$ there is a universal set $X_0 \subseteq X$ with
$\mu(X_0) = 1$, such that for each $\mathbf{Z} = (Z, \mathcal{T}, m, \rho)$, each $g \in L^\infty(Z)$ and each $x \in X_0$, the
sequence

\[ T_N(f, g)(x, z) \]

converges for $m$-almost every $z$. Then the last statement above also holds for each
$f \in L^p(X)$ and each $g \in L^q(Z)$.

Proof. For each $f \in L^p(X)$ and each $x \in X$ define 

\[ R^*f(x) := \sup_{\mathbf{Z}} \sup_{\|g\|_{L^q(Z)}=1} \big\| \sup_N |T_N(f,g)(x,z)| \big\|_{L^q_z(Z)}, \]

where the first supremum above is taken over all dynamical systems $\mathbf{Z}$. Note first that 
Theorem 4.2 implies that 

\[ R^*f(x) = \sup_{\|g\|_{L^q(Y)}=1} \big\| \sup_N |T_N(f,g)(x,y)| \big\|_{L^q_y(Y)}. \]

Second, for each $g \in L^q(Y)$ the quantity 

\[ \big\| \sup_N |T_N(f,g)(x,y)| \big\|_{L^q_y(Y)} \]

gives rise to a measurable function of $x$, by Fubini's theorem. Third, since $L^q(Y)$ is 
separable it follows that for each $x$ the latter supremum can be taken over a fixed countable 
family of functions $g_n$ which is dense in $L^q(Y)$. With these observations, the fact that 
$R^*f(x)$ is a measurable function of $x$ follows immediately. Moreover, (15) implies that 

\[ \|R^*f(x)\|_{L^p_x(X)} \lesssim \|f\|_{L^p(X)}. \]

Fix $f \in L^p(X)$. Let $f_i \in L^\infty(X)$ be such that $\|f - f_i\|_{L^p(X)} \to 0$. For each $i$ denote by 
$X_{0,i}$ the universal set corresponding to $f_i$. Define $X_0 := \bigcap_i X_{0,i}$ and note that it has full 
measure. For each dynamical system $\mathbf{Z}$ as above and for each $g \in L^q(Z)$ let $g_j \in L^\infty(Z)$ be 
such that $\|g_j\|_{L^q(Z)} \le 2$ and $\|g - g_j\|_{L^q(Z)} \to 0$. Now for each $x \in X_0$ and each $g \in L^q(Z)$ 
with $\|g\|_{L^q(Z)} = 1$ we have 

\[ \Big\| \limsup_{N,M\to\infty} |T_N(f,g)(x,z) - T_M(f,g)(x,z)| \Big\|_{L^q_z(Z)} \le 4 \inf_i R^*(f - f_i)(x) + 2 \inf_j R^*f(x)\, \|g - g_j\|_{L^q(Z)} = 4 \inf_i R^*(f - f_i)(x). \]

We deduce that 

\[ \Big\| \sup_{\mathbf{Z}} \sup_{\|g\|_{L^q(Z)}=1} \big\| \limsup_{N,M\to\infty} |T_N(f,g)(x,z) - T_M(f,g)(x,z)| \big\|_{L^q_z(Z)} \Big\|_{L^p_x(X)} \le 4 \inf_i \|R^*(f - f_i)(x)\|_{L^p_x(X)} \lesssim \inf_i \|f - f_i\|_{L^p(X)} = 0. \]

The universal set associated with $f$ is obtained as the intersection of the set $X_0$ 
with the set of those $x \in X$ for which 

\[ \sup_{\mathbf{Z}} \sup_{\|g\|_{L^q(Z)}=1} \big\| \limsup_{N,M\to\infty} |T_N(f,g)(x,z) - T_M(f,g)(x,z)| \big\|_{L^q_z(Z)} = 0. \]



5. Transfer to ergodic theory 

We first sketch how inequalities (10) and (11) imply their counterparts 
in ergodic theory, that is (4) and (5), respectively. At the end of the section we 
prove that Theorem 3.14 implies Theorem 3.10, and indicate how a similar argument and 
Theorem 3.14 imply yet another proof of Bourgain's Return Times theorem. 

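5.1. Transfer for maximal averages ((10) $\Rightarrow$ (4)). Fix some $\phi : \mathbb{Z} \to \mathbb{Z}_+$ with 
finite support. For each $a \in \mathbb{Z}$, denote with $C(\phi)(a)$ the best constant which makes the 
following inequality true for each finitely supported $\psi : \mathbb{Z} \to \mathbb{Z}_+$ 

\[ \Big\| \sup_N \frac{1}{N} \sum_{b=0}^{N-1} \phi(a+b)\psi(c+b) \Big\|_{l^2_c(\mathbb{Z})} \le C(\phi)(a)\, \|\psi\|_{l^2(\mathbb{Z})}. \]

We claim that for each $1 < p < \infty$ we have 

\[ \|C(\phi)\|_{l^p(\mathbb{Z})} \lesssim \|\phi\|_{l^p(\mathbb{Z})}, \tag{16} \]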
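with bounds independent of $\phi$. To see this, for each $\phi$ and $\psi$ as above define $f : \mathbb{R} \to \mathbb{R}$ 
with $f(x) := \phi([x])$ and $g : \mathbb{R} \to \mathbb{R}$ with $g(x) := \psi([x])$. Note that for each $a \le x < a + \frac12$ 
and each $c \le z < c + \frac12$ we have that 

\[ \frac{1}{N} \sum_{b=0}^{N-1} \phi(a+b)\psi(c+b) \lesssim \frac{1}{N} \Big| \int_0^N f(x+y) g(z+y)\, dy \Big|, \]

uniformly in $x, z, N$. Note also that $\|\phi\|_{l^p(\mathbb{Z})} \sim \|f\|_{L^p(\mathbb{R})}$, $\|\psi\|_{l^2(\mathbb{Z})} \sim \|g\|_{L^2(\mathbb{R})}$. It turns 
out that 

\[ \Big\| \sup_N \frac{1}{N} \sum_{b=0}^{N-1} \phi(a+b)\psi(c+b) \Big\|_{l^2_c(\mathbb{Z})} \lesssim \inf_{a \le x < a + \frac12} \Big\| \sup_{t>0} \frac{1}{t} \Big| \int_0^t f(x+y) g(z+y)\, dy \Big| \Big\|_{L^2_z(\mathbb{R})}, \]

and so 

\[ C(\phi)(a) \lesssim \inf_{a \le x < a + \frac12} \sup_{\|g\|_{L^2(\mathbb{R})}=1} \Big\| \sup_{t>0} \frac{1}{t} \Big| \int_0^t f(x+y) g(z+y)\, dy \Big| \Big\|_{L^2_z(\mathbb{R})}, \]

which upon using (3.8) yields 

\[ \|C(\phi)\|_{l^p(\mathbb{Z})} \lesssim \Big\| \sup_{\|g\|_{L^2(\mathbb{R})}=1} \Big\| \sup_{t>0} \frac{1}{t} \Big| \int_0^t f(x+y) g(z+y)\, dy \Big| \Big\|_{L^2_z(\mathbb{R})} \Big\|_{L^p_x(\mathbb{R})} \lesssim \|f\|_{L^p(\mathbb{R})} \lesssim \|\phi\|_{l^p(\mathbb{Z})}. \]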



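Consider next two dynamical systems $\mathbf{X} = (X, \Sigma, \mu, \tau)$ and $\mathbf{Y} = (Y, \mathcal{F}, \nu, \sigma)$, where 
the second one is assumed to be ergodic. Fix some large $K > 0$, a positive function 
$f \in L^p(X)$, and a point $x \in X$. For each $0 \le a \le K/2$ and each $y \in Y$ define $C(a,x,y)$ 
to be the smallest constant for which 

\[ \sum_{0 \le c \le K/2} \Big( \sup_{N \le K/2} \frac{1}{N} \sum_{b=0}^{N-1} f(\tau^{a+b}x)\, g(\sigma^{c+b}y) \Big)^2 \le C^2(a,x,y) \sum_{0 \le n \le K} g^2(\sigma^n y), \tag{17} \]

for each positive function $g \in L^2(Y)$. It is an immediate consequence of (16) that 

\[ \sum_{0 \le a \le K/2} \sup_{y \in Y} C(a,x,y)^p \lesssim \sum_{0 \le n \le K} f^p(\tau^n x). \tag{18} \]

To see this it suffices to apply (16) to the functions $\phi, \psi : \mathbb{Z} \to \mathbb{Z}$ defined by 

\[ \phi(n) := \begin{cases} f(\tau^n x) & : 0 \le n \le K \\ 0 & : \text{otherwise} \end{cases} \qquad \psi(n) := \begin{cases} g(\sigma^n y) & : 0 \le n \le K \\ 0 & : \text{otherwise.} \end{cases} \]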

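By integrating with respect to $y$ in (17) we get for each $x$, $0 \le a \le K/2$ and each 
$g \in L^2(Y)$ 

\[ \Big[\frac{K}{2} + 1\Big] \int \Big( \sup_{N \le K/2} \frac{1}{N} \sum_{b=0}^{N-1} f(\tau^{a+b}x)\, g(\sigma^{b}y) \Big)^2 dy \le [K+1] \sup_{y \in Y} C(a,x,y)^2 \int g^2(y)\, dy. \]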
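Given the universality of $C(a,x,y)$ we get 

\[ \sup_{\substack{\|g\|_{L^2(Y)}=1 \\ g \ge 0}} \int \Big( \sup_{N \le K/2} \frac{1}{N} \sum_{b=0}^{N-1} f(\tau^{a+b}x)\, g(\sigma^{b}y) \Big)^2 dy \lesssim \sup_{y \in Y} C(a,x,y)^2. \]

Combining this with (18) we get 

\[ \sum_{0 \le a \le K/2} \sup_{\substack{\|g\|_{L^2(Y)}=1 \\ g \ge 0}} \Big( \int \Big( \sup_{N \le K/2} \frac{1}{N} \sum_{b=0}^{N-1} f(\tau^{a+b}x)\, g(\sigma^{b}y) \Big)^2 dy \Big)^{p/2} \lesssim \sum_{0 \le n \le K} f^p(\tau^n x). \]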



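Integrate the above with respect to $x$ and divide by $K$ to get 

\[ \int \sup_{\substack{\|g\|_{L^2(Y)}=1 \\ g \ge 0}} \Big( \int \Big( \sup_{N \le K/2} \frac{1}{N} \sum_{b=0}^{N-1} f(\tau^{b}x)\, g(\sigma^{b}y) \Big)^2 dy \Big)^{p/2} dx \lesssim \int f^p(x)\, dx. \]

Finally, let $K \to \infty$ and use the Monotone Convergence Theorem to conclude that 

\[ \Big\| \sup_{\|g\|_{L^2(Y)}=1} \Big\| \sup_N \frac{1}{N} \sum_{b=0}^{N-1} f(\tau^{b}x)\, g(\sigma^{b}y) \Big\|_{L^2_y(Y)} \Big\|_{L^p_x(X)} \lesssim \|f\|_{L^p(X)}. \]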



Note that this, together with Theorem 4.2, immediately implies (4). 

5.2. Transfer for maximal truncated series ((11) $\Rightarrow$ (5)). The transfer from (11) 
to (5) involves similar steps. We start by observing the following immediate consequence 
of Corollaries 3.8 and 3.9: 

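Corollary 5.1. For each $1 < p < \infty$ and each $f \in L^p(\mathbb{R})$ we have: 

\[ \Big\| \sup_{\|g\|_{L^2(\mathbb{R})}=1} \Big\| \sup_N \Big| \sum_{n=1}^{N} \frac{1}{n} \Big( \int_{n}^{n+1} f(x+y) g(z+y)\, dy - \int_{-n}^{-n+1} f(x+y) g(z+y)\, dy \Big) \Big| \Big\|_{L^2_z(\mathbb{R})} \Big\|_{L^p_x(\mathbb{R})} \lesssim \|f\|_{L^p(\mathbb{R})}. \]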



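Proof. First note that Corollary 3.9 implies that 

\[ \Big\| \sup_{\|g\|_{L^2(\mathbb{R})}=1} \Big\| \sup_N \Big| \int_{1 < |y| < N} f(x+y) g(z+y)\, \frac{dy}{y} \Big| \Big\|_{L^2_z(\mathbb{R})} \Big\|_{L^p_x(\mathbb{R})} \lesssim \|f\|_{L^p(\mathbb{R})}, \quad 1 < p < \infty. \]

Also, 

\[ \Big| \sum_{n=1}^{N} \frac{1}{n} \Big( \int_{n}^{n+1} f(x+y) g(z+y)\, dy - \int_{-n}^{-n+1} f(x+y) g(z+y)\, dy \Big) - \int_{1 < |y| < N} f(x+y) g(z+y)\, \frac{dy}{y} \Big| \lesssim \sup_{t>0} t^{-1} \int_{-t}^{t} |f(x+y) g(z+y)|\, dy, \]

and thus Corollary 3.8 finishes the proof. 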
Fix again some $\phi : \mathbb{Z} \to \mathbb{Z}$ with finite support. For each $a \in \mathbb{Z}$, denote with $C(\phi)(a)$ 
the best constant which makes the following inequality true for each finitely supported 
$\psi : \mathbb{Z} \to \mathbb{Z}$ 

\[ \Big\| \sup_N \Big| \sum_{0 < |b| \le N} \frac{1}{b}\, \phi(a+b)\psi(c+b) \Big| \Big\|_{l^2_c(\mathbb{Z})} \le C(\phi)(a)\, \|\psi\|_{l^2(\mathbb{Z})}. \]

We claim that for each $1 < p < \infty$ we have 

\[ \|C(\phi)(a)\|_{l^p_a(\mathbb{Z})} \lesssim \|\phi\|_{l^p(\mathbb{Z})}, \tag{19} \]

with bounds independent of $\phi$. To see this, for each $\phi$ and $\psi$ as above define $f, g : \mathbb{R} \to \mathbb{R}$ 
with 

r ^H) = [*M<*<M + 5 (a . ) = JV(M) : W + I<^<N + | 
: otherwise " ' ' | : otherwise 

Note that for each a < x < a + and each c<z<c+j^we have that 



X , s , , , s AT 



b ti n Jn J-n 

uniformly in $x, z, N$. Note also that $\|\phi\|_{l^p(\mathbb{Z})} \sim \|f\|_{L^p(\mathbb{R})}$, $\|\psi\|_{l^2(\mathbb{Z})} \sim \|g\|_{L^2(\mathbb{R})}$. Inequality (19) 
follows as before, by using Corollary 5.1. The transfer from $\mathbb{Z}$ to dynamical 
systems follows exactly the same path as in the case of averages. 

5.3. Transfer for the pointwise convergence (Theorem 3.14 $\Rightarrow$ Theorem 3.10). 

We first observe that it suffices to prove the convergence of the series in Theorem 3.10 
along a lacunary subsequence. Indeed, fix some $f \in L^\infty(X)$ and assume that for each 
$d_i = 2^{1/i}$, $i \in \mathbb{N}$, we know that there exists a universal set $X_i \subset X$ with $\mu(X_i) = 1$, 
such that for each second dynamical system $\mathbf{Y} = (Y, \mathcal{F}, \nu, \sigma)$, each $g \in L^\infty(Y)$ and each 
$x \in X_i$, the limit 

\[ \lim_{k \to \infty} \sum_{\substack{-d_i^k < n < d_i^k \\ n \ne 0}} \frac{f(\tau^n x)\, g(\sigma^n y)}{n} \tag{20} \]

exists $\nu$-almost everywhere. Let $X'$ be a subset of $X$ of full measure such that $|f(\tau^n x)| \le 
\|f\|_{L^\infty(X)}$ for each $x \in X'$ and each $n \in \mathbb{Z}$. We then use the boundedness of both the 
weight and the test function to argue that for each $g \in L^\infty$, for each $x \in X'$, for each 
$i \in \mathbb{N}$ and for almost every $y \in Y$ we have 

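\[ \limsup_{N,M \to \infty} \Big| \sum_{\substack{n=-N \\ n \ne 0}}^{N} \frac{f(\tau^n x) g(\sigma^n y)}{n} - \sum_{\substack{n=-M \\ n \ne 0}}^{M} \frac{f(\tau^n x) g(\sigma^n y)}{n} \Big| \le \limsup_{k,l \to \infty} \Big| \sum_{0 < |n| < d_i^k} \frac{f(\tau^n x) g(\sigma^n y)}{n} - \sum_{0 < |n| < d_i^l} \frac{f(\tau^n x) g(\sigma^n y)}{n} \Big| + C \log d_i\, \|f\|_{L^\infty(X)} \|g\|_{L^\infty(Y)}. \]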



18 CIPRIAN DEMETER, MICHAEL LACEY, TERENCE TAO, AND CHRISTOPH THIELE 

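Since $i$ can be chosen arbitrarily large, for each $x \in X_0 := \bigcap_{i \in \mathbb{N}} X_i \cap X'$ we get that 

\[ \limsup_{N,M \to \infty} \Big| \sum_{\substack{n=-N \\ n \ne 0}}^{N} \frac{f(\tau^n x) g(\sigma^n y)}{n} - \sum_{\substack{n=-M \\ n \ne 0}}^{M} \frac{f(\tau^n x) g(\sigma^n y)}{n} \Big| = 0, \]

for $\nu$-almost every $y$. 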

It remains to prove that the convergence of the subsequences in (20) follows from 
Theorem 3.14. To ease the exposition we will restrict attention to the case $d = 2$ 
(that is $i = 1$). The argument for general $i$ poses no further difficulties. Let $K$ be a 
$C^\infty(\mathbb{R})$ kernel which satisfies the requirements of Theorem 3.14 and in addition satisfies 
$K(x) = \frac{1}{x}$ for $|x| \ge 1$. Introduce the kernels $H_k : \mathbb{R} \to \mathbb{R}$, $k \ge 1$ (these are rough versions 
of the kernels $\mathrm{Dil}^1_{2^k} K$) defined by the formula 

-2 k <i<2 k -l ~ ~ ieZ\[-2 fc ,2 fc -l] 

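Take an arbitrary sequence $k_1 < k_2 < \ldots < k_J$ of positive integers. Let $\epsilon(2)$ be such that 
Theorem 3.14 holds when $p = 2$ and $d = 2$. As a consequence of this theorem we get that 
for each $f \in L^2(\mathbb{R})$ 

\[ \Big\| \sup_{\|g\|_{L^2(\mathbb{R})}=1} \Big\| \Big( \sum_{j=1}^{J-1} \sup_{k_j \le k < k_{j+1}} \Big| \int f(x+y) g(z+y) \big(H_k(y) - H_{k_{j+1}}(y)\big)\, dy \Big|^2 \Big)^{1/2} \Big\|_{L^2_z(\mathbb{R})} \Big\|_{L^2_x(\mathbb{R})} \lesssim J^{\epsilon(2)} \|f\|_{L^2(\mathbb{R})}, \tag{21} \]

with some universal implicit constant (independent of $J$, in particular). Indeed, note that 

\[ |H_k(y) - \mathrm{Dil}^1_{2^k} K(y)| \lesssim \begin{cases} \dfrac{1}{2^{2k}}, & |y| \le 2^k \\[1ex] \dfrac{1}{y^2}, & |y| > 2^k \end{cases} \]

with the implicit constant independent of $k$. From the boundedness of the maximal 
averages (Corollary 3.8) we deduce that 

\[ \Big\| \sup_{\|g\|_{L^2(\mathbb{R})}=1} \Big\| \Big( \sum_{k} \Big| \int f(x+y) g(z+y) \big(H_k(y) - \mathrm{Dil}^1_{2^k} K(y)\big)\, dy \Big|^2 \Big)^{1/2} \Big\|_{L^2_z(\mathbb{R})} \Big\|_{L^2_x(\mathbb{R})} \lesssim \|f\|_{L^2(\mathbb{R})}. \]

This together with the inequality in Theorem 3.14 and the fact that the terms $k_j$ are 
positive proves (21). 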

The next step consists of transferring (21) to integers. By following the same lines as 
in the previous subsections, that is by considering functions $f, g : \mathbb{R} \to \mathbb{R}$ with 



/(*) := 



<MM) = [x] + i<x<[x] + \ 



g(x) :-- 



: otherwise 
we get that for each <p : Z — > Z with finite support 



^(N) = [x\ + \<x<[x\ + \ 
: otherwise 



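\[ \Big\| \sup_{\|\psi\|_{l^2(\mathbb{Z})}=1} \Big\| \Big( \sum_{j=1}^{J-1} \sup_{k_j \le k < k_{j+1}} \Big| \sum_{b \in \mathbb{Z}} \phi(a+b)\psi(c+b) \big(H_k(b) - H_{k_{j+1}}(b)\big) \Big|^2 \Big)^{1/2} \Big\|_{l^2_c(\mathbb{Z})} \Big\|_{l^2_a(\mathbb{Z})} \lesssim J^{\epsilon(2)} \|\phi\|_{l^2(\mathbb{Z})}, \tag{22} \]

where the first supremum above is taken over all finitely supported functions $\psi : \mathbb{Z} \to \mathbb{Z}$. 
For each $k \ge 1$ introduce the kernels $A_k : \mathbb{Z} \to \mathbb{Z}$ and $S_k : \mathbb{Z} \to \mathbb{Z}$ defined by 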

AS):={ "2*<i<2 fc j 5fc(i):= { i -2 fe <^<2 fe , 

| 0, otherwise j 0, otherwise 

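and note that for each $k < k'$ 

\[ H_k - H_{k'} = O_k - O_{k'} := (A_k - S_k) - (A_{k'} - S_{k'}). \]

Thus (22) gives 

\[ \Big\| \sup_{\|\psi\|_{l^2(\mathbb{Z})}=1} \Big\| \Big( \sum_{j=1}^{J-1} \sup_{k_j \le k < k_{j+1}} \Big| \sum_{b \in \mathbb{Z}} \phi(a+b)\psi(c+b) \big(O_k(b) - O_{k_{j+1}}(b)\big) \Big|^2 \Big)^{1/2} \Big\|_{l^2_c(\mathbb{Z})} \Big\|_{l^2_a(\mathbb{Z})} \lesssim J^{\epsilon(2)} \|\phi\|_{l^2(\mathbb{Z})}, \]

where the first supremum above is taken over all finitely supported functions $\psi : \mathbb{Z} \to \mathbb{Z}$. 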
Standard transfer to a dynamical system $\mathbf{X} = (X, \Sigma, \mu, \tau)$, as described earlier, leads to 

\[ \Big\| \sup_{(Y, \mathcal{F}, \nu, \sigma)} \sup_{\|g\|_{L^2(Y)}=1} \Big\| \Big( \sum_{j=1}^{J-1} \sup_{k_j \le k < k_{j+1}} \Big| \sum_{n \in \mathbb{Z}} f(\tau^n x) g(\sigma^n y) \big(O_k(n) - O_{k_{j+1}}(n)\big) \Big|^2 \Big)^{1/2} \Big\|_{L^2_y(Y)} \Big\|_{L^2_x(X)} \lesssim J^{\epsilon(2)} \|f\|_{L^2(X)}, \tag{23} \]

with some universal implicit constant, where the first supremum is taken over all possible 
dynamical systems $\mathbf{Y} = (Y, \mathcal{F}, \nu, \sigma)$. It is then easy to see that this implies the following 
statement: 

(S): For each function $f \in L^\infty(X)$ there is a universal set $X_0 \subset X$ with $\mu(X_0) = 1$, 
such that for each second dynamical system $\mathbf{Y} = (Y, \mathcal{F}, \nu, \sigma)$, each $g \in L^\infty(Y)$ and each 
$x \in X_0$, the weighted averages 

\[ \sum_{n \in \mathbb{Z}} f(\tau^n x)\, g(\sigma^n y)\, O_k(n) \]

converge $\nu$-almost everywhere as $k \to \infty$. 

To see this, assume for contradiction that the above fails for some $f \in L^\infty(X)$. It follows 
that there is a measurable set $X' \subset X$ of positive $\mu$ measure, such that for each $x \in X'$ 
there is a system $\mathbf{Y}_x$, a function $g_x \in L^2(Y_x)$ with $\|g_x\|_{L^2(Y_x)} = 1$ and $\alpha(x), \beta(x) > 0$ 
such that 

\[ \limsup_{k \to \infty} \sum_{n \in \mathbb{Z}} f(\tau^n x) g_x(\sigma^n y) O_k(n) - \liminf_{k \to \infty} \sum_{n \in \mathbb{Z}} f(\tau^n x) g_x(\sigma^n y) O_k(n) > \alpha(x) \]

for $y$ in a set of $\nu$ measure $\beta(x)$. An elementary measure theoretic argument shows that 
one can choose a set $X'' \subset X'$ of positive $\mu$ measure such that $\alpha(x) > \alpha$ and $\beta(x) > \beta$ for 
each $x \in X''$, for some $\alpha, \beta > 0$. A similar argument shows the existence of a set $X''' \subset X''$ 
of positive $\mu$ measure and of a sequence of positive integers $(k_j)_{j \in \mathbb{N}}$ such that 
sup 



a 

> 7T> 



J2f(r n x)g x (a n y)Ok(n) - £ f(r n x)g x (a n y)O kj+1 (n) 

n£Z nSZ 

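for each $j \in \mathbb{N}$ and for each $(x, y) \in X''' \times Y'_x$, where $\nu(Y'_x) > \beta$. We immediately get that 
for each $J$ 

\[ \Big\| \sup_{(Y, \mathcal{F}, \nu, \sigma)} \sup_{\|g\|_{L^2(Y)}=1} \Big\| \Big( \sum_{j=1}^{J-1} \sup_{k_j \le k < k_{j+1}} \Big| \sum_{n \in \mathbb{Z}} f(\tau^n x) g(\sigma^n y) \big(O_k(n) - O_{k_{j+1}}(n)\big) \Big|^2 \Big)^{1/2} \Big\|_{L^2_y(Y)} \Big\|_{L^2_x(X)} \gtrsim J^{1/2}, \]

which together with the fact that $\epsilon(2) < \frac12$ contradicts inequality (23). The reader is 
referred to Section 4 for measurability issues regarding the selections of the various sets 
in the above argument. 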

The last portion of the argument is devoted to proving the statement (S) for the 
weighted averages where A k (n) replaces O k (n). This will follow from Bourgain's result 
for standard averages, Theorem 1.1, by means of a common averaging procedure described 
below. We analyze the two one-sided sums separately, since the mean zero property is no 
longer crucial in this case. Note that in particular for each k > 1 

\[ \sum_{n \ge 1} f(\tau^n x) g(\sigma^n y) A_k(n) = \sum_{n=1}^{2^k} n \big(A_k(n) - A_k(n+1)\big) \Big( \frac{1}{n} \sum_{i=1}^{n} f(\tau^i x) g(\sigma^i y) \Big). \]



By using Bourgain's result, the fact that 

\[ \lim_{k \to \infty} n \big(A_k(n) - A_k(n+1)\big) = 0 \]

for each $n \ge 1$ and the fact that 

\[ \sup_{k > 0} \sum_{n \ge 1} \big| n \big(A_k(n) - A_k(n+1)\big) \big| < \infty, \]

it follows that we have the return times result for $\sum_{n \ge 1} f(\tau^n x) g(\sigma^n y) A_k(n)$. A similar 
argument works for $\sum_{n \le -1} f(\tau^n x) g(\sigma^n y) A_k(n)$. We also trivially have the same result for 



f(T°x)g(<j°y)A k (0) = ^-f(x)g(y). 



This ends the argument. 
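
The summation by parts step above can be checked numerically. The following is a minimal sketch, not the paper's construction: the weights `A` are a hypothetical linear ramp supported on $1, \ldots, 2^k$ (the actual $A_k$ is not assumed here), and `a` stands in for the products $f(\tau^n x) g(\sigma^n y)$. It verifies that the direct weighted sum agrees with the sum of Cesàro averages weighted by $n(A(n) - A(n+1))$, and that these new weights have bounded total mass.

```python
def abel_both_sides(a, A):
    """Return (direct, by_parts) for sum_{n=1}^{M} a_n A(n).

    a[n-1] holds a_n and A[n-1] holds A(n) for n = 1..M; A(M+1) is taken to be 0.
    """
    M = len(a)
    direct = sum(a[n] * A[n] for n in range(M))
    by_parts, partial = 0.0, 0.0
    for n in range(1, M + 1):
        partial += a[n - 1]                  # a_1 + ... + a_n
        A_next = A[n] if n < M else 0.0      # A(n+1), zero past the support
        by_parts += n * (A[n - 1] - A_next) * (partial / n)
    return direct, by_parts


def weight_mass(A):
    """sum_n |n (A(n) - A(n+1))| with A(M+1) = 0."""
    M = len(A)
    return sum(n * abs(A[n - 1] - (A[n] if n < M else 0.0)) for n in range(1, M + 1))


# Hypothetical data: a linear ramp of weights on 1..M with M = 2**k,
# and an arbitrary bounded sequence playing the role of f(tau^n x) g(sigma^n y).
k = 6
M = 2 ** k
A = [(1.0 - n / M) / M for n in range(1, M + 1)]
a = [(-1) ** n * (1 + n % 3) for n in range(1, M + 1)]
direct, by_parts = abel_both_sides(a, A)
```

Because the new weights multiply averages rather than individual terms, convergence of the averages (Bourgain's theorem in the text) transfers to the weighted sums once the weights have uniformly bounded mass and tend to zero termwise.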



5.4. Proof of Bourgain's Return Times theorem (Theorem 3.14 $\Rightarrow$ Theorem 1.1). 
The argument goes as in the previous subsection. The only difference is that this time 
we apply Theorem 3.14 for each $i \in \mathbb{N}$ to a $C^\infty(\mathbb{R})$ kernel $K_i$ which equals 1 on $[-1, 1]$ 
and 0 on $\mathbb{R} \setminus [-1 - \frac{1}{i}, 1 + \frac{1}{i}]$, and which also satisfies $\|K_i\|_{L^\infty} \le 1$. The error term caused 
by the restriction of $K_i$ to $1 < |x| < 1 + \frac{1}{i}$ is $O(\frac{1}{i})$, and hence can be eliminated by letting 
$i \to \infty$. 

6. Discretization 

We begin this section with the definition of a (saturated) grid. 

Definition 6.1. A set $\mathcal{G}'$ of intervals each with length in the set $\{2^k : k \in \mathbb{Z}\}$ is called a 
saturated grid if 

(1) for each $k \in \mathbb{Z}$ there exists $o(k) \in \mathbb{R}$ such that $[o(k) + n2^k, o(k) + (n+1)2^k] \in \mathcal{G}'$ 
for each $n \in \mathbb{Z}$ 

(2) for every $I, I' \in \mathcal{G}'$ with $I \cap I' \ne \emptyset$ we have that either $I \subseteq I'$ or $I' \subseteq I$. 
If only the second axiom is satisfied then we call $\mathcal{G}'$ a grid. 

The endpoints of the intervals in the grid are called dyadic points. We note that if $\mathcal{G}'$ 
is a saturated grid, then for each interval $\omega = [a, b] \in \mathcal{G}'$, the subintervals $\omega_1 = [a, b]_1 := 
[a, \frac{a+b}{2}]$ and $\omega_2 = [a, b]_2 := [\frac{a+b}{2}, b]$, called the sons of $\omega$, are also in $\mathcal{G}'$. We define the 
descendants of $\omega$ as the collection of all elements of $\mathcal{G}'$ which contains its sons, the sons of 
its sons and so on. In general, the intervals on the frequency axis will be referred to by 
the letter $\omega$ while those on the time axis by the letter $I$. 

The standard saturated grid $\mathcal{S}$ is defined by 

\[ \mathcal{S} := \{ [2^i l, 2^i (l+1)] : i, l \in \mathbb{Z} \}. \]

We will be interested in the following types of grids on the frequency axis: for each odd 
integer $N \ge 3$, $0 \le j \le N - 2$ and $0 \le L \le N - 1$ the collection 

:i = j (mod N - 1), I G zj 

is a grid, as it easily follows from the fact that $2^{N-1} \equiv 1 \pmod{N}$. It is not in general 
a saturated grid, since the first requirement in Definition 6.1 is only satisfied for $k \equiv j 
\pmod{N-1}$. However, one can easily turn $\mathcal{G}_{N,j,L}$ into a saturated grid denoted by $\mathcal{G}'_{N,j,L}$ 
by adding all the descendants of the intervals already in the grid. Another interesting 
observation concerns the fact that for each fixed $N$ the grids $\mathcal{G}_{N,j,L}$ are pairwise disjoint, 
for $0 \le j \le N - 2$ and $0 \le L \le N - 1$. 
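
The grid axioms are easy to test on a finite window of intervals. The sketch below is an illustration only: it represents intervals as half-open pairs of exact fractions (an assumption made here so that adjacent intervals do not share endpoints), checks the nestedness axiom of Definition 6.1 on a window of the standard grid, and confirms that sons of an interval stay inside the window. The last line records the congruence $2^{N-1} \equiv 1 \pmod N$ for $N = 41$, the value that appears later in $\mathcal{G}_{41,j,L}$.

```python
from fractions import Fraction
from itertools import combinations


def dyadic(k, n, offset=Fraction(0)):
    """Half-open interval [offset + n 2^k, offset + (n+1) 2^k) as a pair of Fractions."""
    length = Fraction(2) ** k
    return (offset + n * length, offset + (n + 1) * length)


def nested_or_disjoint(I, J):
    (a, b), (c, d) = I, J
    if b <= c or d <= a:                 # empty intersection
        return True
    return (c <= a and b <= d) or (a <= c and d <= b)


def is_grid(intervals):
    """Second axiom of Definition 6.1: intersecting intervals are nested."""
    return all(nested_or_disjoint(I, J) for I, J in combinations(intervals, 2))


def sons(I):
    """Left and right halves of an interval."""
    a, b = I
    mid = (a + b) / 2
    return (a, mid), (mid, b)


# A finite window of the standard grid: scales 2^-2 .. 2^2 inside [0, 8).
scales = range(-2, 3)
standard = {dyadic(k, n) for k in scales for n in range(int(8 / Fraction(2) ** k))}

# Adding an interval that straddles a dyadic point breaks the grid property.
broken = standard | {(Fraction(1, 2), Fraction(3, 2))}

fermat_check = pow(2, 40, 41)            # 2^(N-1) mod N for N = 41
```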

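Fix now a kernel $K$ as in Theorem 3.5. For each $f \in L^\infty(\mathbb{R})$ with finite support and 
each $x$ define the operator 

\[ T_{f,x,K}\, g(z) := \sup_{k \in \mathbb{Z}} \Big| \int f(x+y)\, g(z+y)\, \mathrm{Dil}^1_{2^k} K(y)\, dy \Big|. \]

Note that we have to prove 

\[ \big\| \|T_{f,x,K}\|_{L^2_z(\mathbb{R}) \to L^2_z(\mathbb{R})} \big\|_{L^p_x(\mathbb{R})} \lesssim \|f\|_{L^p(\mathbb{R})}. \]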

Choose T] : R — > R such that is a C°°(R \ {0}) function which equals lim^ + 011 
(0, |] , lim^ - K(£) on [— 1,0) and outside [—§,§]■ The two limits exist due to the 
fact that < 1 for ^ 7^ 0. It suffices to prove 

||ll 7 >,x,»j||i^(R)-»^(R)|| £ p( R ) ~ 11/IUnR) ( 24 ) 



'N,j,L ■- 



»4 



1 /(x + y)^ + y)Dili fc X(y)dy . 



(25) 
The proofs for the above inequalities will follow from a more general principle, as explained 
below. The crucial property of the multiplier $\hat{K} - \hat{\eta}$ that will be used later is the following 

l^^(OI<^min{|£|l}, n>0. (26) 

Note that the additional inequality \K — r}(£)\ < |£| for £ ^ is a consequence of the fact 
that < 1 for £ ^ 0. Write 

00 t 

K-vki)= E K -v(tM%)> (27) 

j=-oo 

where q is some Schwartz function supported in the annulus | < |£| < | such that 

5>(|) = 1 > 

iez 

As a consequence of (26), each function gj = K — r)(£)q(-^) will satisfy 

d n 2 _ I J 'I 

for all n > 0, uniformly in j G Z. It follows that that each function Dil^j gj satisfies 

II^Dil 2 -^(0IU r( R)< 2-^1 £^0, 

for all n > 0, uniformly in j G Z. Moreover, it is supported in the annulus | < |£| < |. 
Since the operators Tf ;X; g. and Tj )a , Dil i ^ coincide, inequality (25) will immediately follow 
if we prove that 

||ll r / ) x ) tflUj(R)-»Lj(R)|| Lg( R) ~ 11/IUnR)' ( 28 ) 
uniformly in all Schwartz functions ip supported as above and satisfying 

||^(0IU r( R)<l (29) 

for all n > 0. 

From now on ip will be either a function as above or the function rf. We next focus 
on proving (28). By a dilation argument we can assume in addition that ^(0 ~~ ^(20 is 
supported in the annulus < |£| < |. Triangle's inequality further allows us to assume 
that the support is inside |x]. Note that 



^(2 fc = E^(^)' 



i>k 



with := ^(2*£) — -^(2 l+ £) supported in [^2 ! ,|x2 *]. For each / as above and for 
each x we have 



|?>,*,tflU3(R)-L2(R) = sup 

llfll|2=l 



su p 1 / f( x + y)9( z + y)^i(y)dy\ 

kez t>k Jn 



L2(R) 



(30) 
Pick a Schwartz function $\varphi$ such that $\hat{\varphi}$ is supported in [0, and satisfies the following 



property for every £ G R: 



E 

l€Z 



I. 



For each scale i use the following expansion for /, valid in every L P (R), 1 < p < oo norm 

m,leZ 

where y?i, m ,« is the modulated wave packet (see [23] for a similar expansion) 

:=2-Mr i i-m)e wrt '. 

Now 

sup|J^ / /(a: + y)tf(* + y)4>i(y)dy\ = sup | £ / (f,^ i)^ i(x + y)g(z + y)4> i (y)dy\ 



m,iSZ 



= sup F fc)X 
feez 1 

where F denotes the reflection F(y) := F(—y) and 

FkM : = J2 J2 (fiVi, m ,^)Vi, m ±{x + y)^i{y)- 

i>k m,l£Z 

With this notation, the inequality (9) follows from 



\\{Fy{Fk,x))k£z\\MZ e (n,) 



< 



l£(R) 



LP(R)- 



(31) 



Here we use M^ e (H) to denote the maximal multiplier norm M| in the 6 variable. Note 
that the Fourier transform of Fk, x {y) in the y variable is 

*y(*V,)(0) = £ £tt¥Wr> / ^o(2 l (^-0)^ mi3 L(Oe 2 ^. 



i>k m,/£Z 



Define i>ro>A (M) := f R M*(6 - 0)^,^(0^^ and note that 

^(x,e) = 2-^0,0(2*5 - ^2- - m)^^^" 4 -™). 
The function o ,o,o is in C°°(R x R), and as a consequence of (29) satisfies the following 



(32) 



d n d ri 



-0o,o,o(M)IU<»(fl) 



0)^im-^-(0 are localized as follows: 



< 



(1 + 1*1 



, Vn,m,M > 0. 



(33) 



The function (f) im j_(x, 9) and its a; Fourier transform ^(^ m j_(x, #))(£) = ip(2 l (9 



supp e (0 i>m « (x,0)) C 



, ,1 o-^ + 2 
41' 41 



+ 



2-' f — + — ) .2" 
1 41 16 ' 



o-iJ_ -i§ 

^ 16'^ 8 



Z + 2 3 
~4T~ + 8 



, for each x (34) 






supp ? (J^(0 i m ^(x,#))(O) C 
The crucial property of these supports is that 
2" 



9-1 0-i[±l 

41' 41 



and 



1 1 , 



2-l,2-i±i 
41' 41 



/Z + 2 3V 






C 







2-^,2- 



, for each 0. 



(35) 



41 



41 



C 



2-^,2- 

41 ' 



I - 18 



41 



+ 1 



where u} ijt := [2 _i ^p,2 _i + l)] is in some (unique) grid G4i,j,L- 

To each m,i,l 6 Z we associate the tile s = [2 l m, 2 l (m + 1)] x u;^ and use the notation 
fs '■— <Pim J_) <f>s : — 0j m x- As a consequence of (32), (33), (34) and (35), the localization 
and decay of <p s can now be summarized as follows: 

supp e (0 s (a;, 6)) C w Sj 2 for each x (36) 



supp c (.F x (0 s (a;, 0))(f)) C w S) i for each 



sup 



<9™ 9 r 



[0 s (a;,0)e- 2 — ] 



uniformly in s. We also note that 



<|/,|<»-m- W X M( X ), Vn,m,M>0, 



Lg°(R) 

supp(^) C w s> i 



and 



sup 



<9 n 



- [^ s (^)e- 2 — ] 



— n— o , ,M 



(37) 

(38) 
(39) 

(40) 



uniformly in s. 

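For each $j, L$ as above define a collection of tiles 

\[ S_{j,L} := \{ [2^i m, 2^i (m+1)] \times \omega : \omega \in \mathcal{G}_{41,j,L},\ m \in \mathbb{Z},\ 2^i |\omega| = 1 \} \]

and note that (31) is equivalent to 

\[ \Big\| \Big\| \Big( \sum_{|I_s| < 2^k} \langle f, \varphi_s \rangle\, \phi_s(x, \theta) \Big)_{k \in \mathbb{Z}} \Big\|_{M^*_{2,\theta}(\mathbb{R})} \Big\|_{L^p_x(\mathbb{R})} \lesssim \|f\|_p. \]

In the above we changed the restriction $|I_s| > 2^k$ into $|I_s| < 2^k$, which is more suitable 
for later purposes. Note that they are equivalent. Theorem 3.5 will be a consequence of the 
following more general result: 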

Theorem 6.2. Let $\mathcal{G}'$ be a saturated grid and let $S$ be some arbitrary finite subcollection 
of the set of all tiles 

\[ S_{univ} := \{ [2^i m, 2^i (m+1)] \times \omega : \omega \in \mathcal{G}',\ m, i \in \mathbb{Z},\ 2^i |\omega| = 1 \}. \]

Consider also two collections $\{\phi_s, s \in S\}$ and $\{\varphi_s, s \in S\}$ of Schwartz functions. The 
functions $\phi_s : \mathbb{R}^2 \to \mathbb{R}$ satisfy (36), (37) and (38), uniformly in $s$. The functions $\varphi_s : 
\mathbb{R} \to \mathbb{R}$ satisfy (39) and (40), uniformly in $s$. 
Then the following inequality holds for each $f \in L^p(\mathbb{R})$, $1 < p < \infty$ 

\[ \Big\| \Big\| \Big( \sum_{\substack{s \in S \\ |I_s| < 2^k}} \langle f, \varphi_s \rangle\, \phi_s(x, \theta) \Big)_{k \in \mathbb{Z}} \Big\|_{M^*_{2,\theta}(\mathbb{R})} \Big\|_{L^p_x(\mathbb{R})} \lesssim \|f\|_p, \tag{41} \]

with the implicit constant depending only on $p$ and on the implicit constants in (36) 
and (38) (in particular it is independent of the choice of the grid). 

The same discretization techniques immediately show that Theorem 3.15 will follow 
from the following: 

Theorem 6.3. Assume we are in the settings from the above theorem. For each $1 < p < 
\infty$ there is $0 < \epsilon(p) < \frac12$ such that for each finite sequence of integers $u_1 < u_2 < \ldots < u_J$ 

j-i 

1/2 



sup sup \r e \ J2 (f^s)H^o)wmm\ii m ) 

Mlz,2 (R) =l j = 1 U j <k<U j+1 aes 

2 u j<\I s \<2k 



<^ (p) II/I|l, ( r), (42) 

with the implicit constant depending only on p and on the implicit constants in (36) 
and (38). 

By a very similar argument, Theorem 3.16 will follow from the following: 

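Theorem 6.4. Assume we are in the settings from Theorem 6.2. For each $1 < p < \infty$ 
the following inequality holds 

\[ \Big\| \Big\| \Big( \sum_{k \in \mathbb{Z}} \Big| \sum_{\substack{s \in S \\ |I_s| = 2^k}} \langle f, \varphi_s \rangle\, \phi_s(x, \theta) \Big|^2 \Big)^{1/2} \Big\|_{M_{2,\theta}(\mathbb{R})} \Big\|_{L^p_x(\mathbb{R})} \lesssim \|f\|_{L^p(\mathbb{R})}, \tag{43} \]

with the implicit constant depending only on $p$ and on the implicit constants in (36) 
and (38). 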

In the remaining sections we will prove Theorems 6.2, 6.3 and 6.4. From now on, by a 
dyadic frequency interval we will understand any interval of the saturated grid $\mathcal{G}'$, while 
a dyadic time interval will continue to refer to an interval in the standard dyadic grid. 

7. Trees 

We now recall some standard terminology concerning trees of tiles (see [23] and [27] 
for more details). 

Definition 7.1 (Tile order). For two tiles $s$ and $s'$ we write $s \le s'$ if $I_s \subseteq I_{s'}$ and $\omega_{s'} \subseteq \omega_s$. 

Definition 7.2 (Trees). A tree with top $T \in S_{univ}$ is a set of tiles $\mathbf{T} \subset S$ such that $s \le T$ 
for each $s \in \mathbf{T}$. For $i = 1, 2$, we say that an $i$-tree is a tree $\mathbf{T}$ such that $\omega_T \subseteq \omega_{s,i}$ for each 
$s \in \mathbf{T} \setminus \{T\}$, where the intervals $\omega_{s,1}$ and $\omega_{s,2}$ are the left and right halves of $\omega_s$. 

We will also encounter a more general instance of a tree called a "quasitree". 
Definition 7.3. A quasitree with top $(I_{\mathbf{T}}, \xi_{\mathbf{T}})$, where $I_{\mathbf{T}}$ is an arbitrary (not necessarily 
dyadic) interval and $\xi_{\mathbf{T}} \in \mathbb{R}$ is a (not necessarily dyadic$^{7}$) point, is a set of tiles $\mathbf{T} \subset S$ 
such that $I_s \subseteq I_{\mathbf{T}}$ and $\xi_{\mathbf{T}} \in \omega_s$ for each $s \in \mathbf{T}$. If $i = 1, 2$, an $i$-quasitree is a quasitree $\mathbf{T}$ 
such that $\xi_{\mathbf{T}} \in \omega_{s,i}$ for each $s \in \mathbf{T}$, where the intervals $\omega_{s,1}$ and $\omega_{s,2}$ are the left and right 
halves of $\omega_s$. 

Remark 7.4. Note that each tree $\mathbf{T}$ with top $T$ is also a quasitree with top $(I, \xi)$, for 
each interval $I_T \subseteq I$ and each $\xi \in \omega_T$ which is not a dyadic point. We will adopt the 
convention that $I_{\mathbf{T}} = I_T$ and $\xi_{\mathbf{T}} \in \omega_{T,1}$, without any further specification on $\xi_{\mathbf{T}}$. 

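The standard decomposition of a quasitree $\mathbf{T}$ with top $(I_{\mathbf{T}}, \xi_{\mathbf{T}})$ is the splitting of $\mathbf{T}$ into 
the 1-quasitree 

\[ \mathbf{T}^{(1)} := \{ s \in \mathbf{T} : \xi_{\mathbf{T}} \in \omega_{s,1} \} \]

and the 2-quasitree 

\[ \mathbf{T}^{(2)} := \{ s \in \mathbf{T} : \xi_{\mathbf{T}} \in \omega_{s,2} \}. \]

Note that if $\mathbf{T}$ is a tree with top $T$ then this decomposition does not depend on the choice 
of $\xi_{\mathbf{T}} \in \omega_{T,1}$, and moreover, if $T \in \mathbf{T}$ then $T \in \mathbf{T}^{(1)}$. 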
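
The tile order and the standard decomposition can be made concrete with a small model. The sketch below is illustrative only: tiles are pairs of half-open intervals with exact fractions, built from the standard dyadic grid rather than a general saturated grid, and the helper names (`tile`, `leq`, `standard_decomposition`) are hypothetical. It builds a tree below a top tile, picks a non-dyadic frequency $\xi$ in the left half of the top's frequency interval, and splits the tree according to which half of $\omega_s$ contains $\xi$.

```python
from fractions import Fraction


def tile(i, m, l):
    """Tile [2^i m, 2^i (m+1)) x [2^-i l, 2^-i (l+1)), so that |I_s| |omega_s| = 1."""
    t = Fraction(2) ** i
    w = Fraction(2) ** (-i)
    return ((m * t, (m + 1) * t), (l * w, (l + 1) * w))


def contains(J, I):
    """True if interval I is contained in interval J."""
    return J[0] <= I[0] and I[1] <= J[1]


def leq(s, t):
    """Tile order of Definition 7.1: I_s inside I_t and omega_t inside omega_s."""
    return contains(t[0], s[0]) and contains(s[1], t[1])


def halves(w):
    mid = (w[0] + w[1]) / 2
    return (w[0], mid), (mid, w[1])


def in_interval(x, J):
    return J[0] <= x < J[1]


def standard_decomposition(tiles, xi):
    """Split tiles by whether xi lies in the left or right half of omega_s."""
    t1 = [s for s in tiles if in_interval(xi, halves(s[1])[0])]
    t2 = [s for s in tiles if in_interval(xi, halves(s[1])[1])]
    return t1, t2


top = tile(2, 0, 2)                      # I = [0, 4), omega = [1/2, 3/4)
tree = [top] + [tile(1, m, 1) for m in range(2)] + [tile(0, m, 0) for m in range(4)]
xi = Fraction(11, 20)                    # a non-dyadic point in the left half of omega_top
t1, t2 = standard_decomposition(tree, xi)
```

In this example the top and the two tiles of scale 2 land in the first piece, while the four unit-scale tiles land in the second, so both pieces are nonempty and together exhaust the tree.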

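Definition 7.5. Fix some $f : \mathbb{R} \to \mathbb{R}$. For a finite subset of tiles $S' \subset S$ define its size 
relative to $f$ as 

\[ \mathrm{size}(S') := \sup_{\mathbf{T}} \Big( \frac{1}{|I_T|} \sum_{s \in \mathbf{T}} |\langle f, \varphi_s \rangle|^2 \Big)^{1/2}, \]

where the supremum is taken over all the 2-trees $\mathbf{T} \subset S'$. 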

We recall two important results regarding the size. 

Proposition 7.6. For each $1 < t < \infty$, each 2-tree $\mathbf{T}$ with top $T$ and each $f \in L^t(\mathbb{R})$ we 
have 

1/2 



Proof See for example Lemma 1.8.1 in [30]. ■ 

The following Bessel type inequality from [24] will be useful in organizing collections of 
tiles into trees. 

Proposition 7.7. Let $S' \subset S$ be a collection of tiles and define $\Delta := [-\log_2(\mathrm{size}(S'))]$, 
where the size is understood with respect to some function $f \in L^2(\mathbb{R})$. Then $S'$ can be 
written as a disjoint union $S' = \bigcup_{n \ge \Delta} \mathcal{P}_n$, where $\mathrm{size}(\mathcal{P}_n) \le 2^{-n}$ and each $\mathcal{P}_n$ consists of 
a family $\mathcal{F}_{\mathcal{P}_n}$ of pairwise disjoint trees satisfying 

\[ \sum_{\mathbf{T} \in \mathcal{F}_{\mathcal{P}_n}} |I_T| \lesssim 2^{2n} \|f\|_2^2, \tag{44} \]

with bounds independent of $S'$, $n$ and $f$. 

7 In fact, $\xi_{\mathbf{T}}$ may always be taken to be non-dyadic. 
In the following we will use the notation for the counting function associated with a 
collection $\mathcal{F}$ of quasitrees 

N T {x) := 

Let T be a 2-quasitree with top (It,£t)- The following decomposition will be useful in 
the future. For each s G T and scale / > we split <f> s (x, 6) as 

For convenience, we set (pf^ := <p s for each s G T. For I > 1 we define the first piece to 
be localized in time: 

supp j>® T (; 0) C 2 l ' l I s , for each 9 G R. 

For the second piece we need some degree of frequency localization, but obviously full 
localization as in the case of S is impossible. We will content ourselves with preserving 
the mean zero property with respect to the top of the quasitree. The advantage of 
over (j) s is that it gains extra decay in x. More precisely, we have for each s G T and each 
M > 

<f)^ T (x,6)e- 27Ti ^ x has mean zero, 6 G R, (45) 



(f)%(x, 6)e~ 27Ti ^ x is c(M)2" M/ - adapted to I s , for some constant c(M), 6 G R, 

(46) 



supp0^ T (:r, •) C uj S)2 , for each a; G R, (47) 



|MtM)I ^ 2- M Ul^y]f(:r), uniformly in x, G R. (48) 

We achieve this decomposition by first choosing a smooth function $\eta$ such that $\mathrm{supp}(\eta) \subset 
[-1/2, 1/2]$ and $\eta = 1$ on $[-1/4, 1/4]$. We then define 

and 

^5 *) : = f moo 7 w / x)e- 2 ^*Dil$ Is r](x)dx + S (£; x)(l - Bil^x)). 

Properties (45) through (48) are now easy consequences of (36), (37) and (38). 

In the next two sections we prove some general results of independent interest, which 
will be used later in the main argument. 
8. A weighted Bourgain's lemma 

For each $1 < r < \infty$ and each sequence $(x_k)_{k \in \mathbb{Z}}$ in a Hilbert space $H$, define the 
$r$-variational norm of $(x_k)_{k \in \mathbb{Z}}$ to be 

\[ \|x_k\|_{V^r_k(\mathbb{Z})} := \sup_k \|x_k\|_H + \|x_k\|_{\tilde{V}^r_k(\mathbb{Z})}, \]

where $\tilde{V}^r_k(\mathbb{Z})$ is the homogeneous $r$-variational seminorm 

\[ \|x_k\|_{\tilde{V}^r_k(\mathbb{Z})} := \sup_{M,\ k_0 < k_1 < \cdots < k_M} \Big( \sum_{m=1}^{M} \|x_{k_m} - x_{k_{m-1}}\|_H^r \Big)^{1/r}. \]

We also write $V^r_k(L)$ and $\tilde{V}^r_k(L)$ for the variational norm of a sequence $(x_k)_{1 \le k \le L}$ of $L$ 
elements. Define also the oscillation norm $\| \cdot \|_{O_U}$ of a sequence $(x_k)$ with respect to the 
sequence of integers $U = (u_j)_{j=1}^{J}$ to be 

\[ \|x_k\|_{O_U} := \Big( \sum_{j=1}^{J-1} \sup_{u_j \le k < u_{j+1}} \|x_k - x_{u_{j+1}}\|_H^2 \Big)^{1/2}. \]

For each $r > 2$ define also the oscillation-variational norm 

\[ \|x_k\|_{O_U \cap V^r_k(\mathbb{Z})} := \|x_k\|_{O_U} + \|x_k\|_{V^r_k(\mathbb{Z})}. \tag{49} \]
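
For finite scalar sequences these norms can be computed exactly, which may help in building intuition. The sketch below is a minimal illustration under two assumptions made here: the sequence is indexed from 0, and the oscillation in each block $u_j \le k < u_{j+1}$ is measured against the value at $u_{j+1}$, matching the form of (21) to (23). The homogeneous seminorm is found by dynamic programming over increasing chains of indices.

```python
def homogeneous_variation(x, r):
    """sup over increasing chains k_0 < ... < k_M of (sum |x_{k_m} - x_{k_{m-1}}|^r)^(1/r)."""
    n = len(x)
    best = [0.0] * n                     # best[j]: largest sum of r-th powers over chains ending at j
    for j in range(n):
        for i in range(j):
            cand = best[i] + abs(x[j] - x[i]) ** r
            if cand > best[j]:
                best[j] = cand
    return max(best, default=0.0) ** (1.0 / r)


def variation_norm(x, r):
    """Inhomogeneous norm: sup |x_k| plus the homogeneous r-variation."""
    return max(abs(v) for v in x) + homogeneous_variation(x, r)


def oscillation_norm(x, u):
    """Oscillation along increasing indices u, comparing each block with its right endpoint."""
    total = 0.0
    for lo, hi in zip(u, u[1:]):
        total += max(abs(x[k] - x[hi]) for k in range(lo, hi)) ** 2
    return total ** 0.5


zigzag = [0.0, 1.0, 0.0, 1.0]
ramp = [0.0, 1.0, 2.0, 3.0]
```

For the monotone `ramp` and $r = 2$ the best chain is the single jump from the first to the last term, while for `zigzag` every jump should be kept; this is the usual contrast between monotone and oscillating sequences for $r > 1$.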
For future reference we record the following easily verified lemma. 

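Lemma 8.1 (Product estimates). For each $k$, let $a_k, b_k$ be some complex numbers and let 
$U := u_1 < u_2 < \ldots < u_J$ be an arbitrary finite sequence of integers. Then for each $r > 1$ 

\[ \|a_k b_k\|_{V^r_k(\mathbb{Z})} \lesssim \|a_k\|_{V^r_k(\mathbb{Z})} \|b_k\|_{V^r_k(\mathbb{Z})}, \]

\[ \|a_k b_k\|_{O_U} \lesssim \|a_k\|_{O_U} \|b_k\|_{l^\infty_k(\mathbb{Z})} + \|b_k\|_{O_U} \|a_k\|_{l^\infty_k(\mathbb{Z})}. \]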
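
The product estimates can be sanity-checked on random data. The following sketch is an illustration, not a proof: it uses finite complex sequences indexed from 0, measures each oscillation block against its right endpoint (an assumption made here), and tests both inequalities with constant 1, which the triangle inequality $|a_k b_k - a_j b_j| \le |a_k|\,|b_k - b_j| + |b_j|\,|a_k - a_j|$ suggests is enough in this scalar setting.

```python
import random


def homogeneous_variation(x, r):
    """Largest r-variation over increasing chains of indices (dynamic programming)."""
    best = [0.0] * len(x)
    for j in range(len(x)):
        for i in range(j):
            best[j] = max(best[j], best[i] + abs(x[j] - x[i]) ** r)
    return max(best, default=0.0) ** (1.0 / r)


def variation_norm(x, r):
    return max(abs(v) for v in x) + homogeneous_variation(x, r)


def oscillation_norm(x, u):
    """Blocks u_j <= k < u_{j+1}, each compared with the value at u_{j+1}."""
    return sum(max(abs(x[k] - x[hi]) for k in range(lo, hi)) ** 2
               for lo, hi in zip(u, u[1:])) ** 0.5


def product_gaps(a, b, r, u):
    """Return (variation gap, oscillation gap); both should be >= 0 up to rounding."""
    p = [s * t for s, t in zip(a, b)]
    sup_a = max(abs(v) for v in a)
    sup_b = max(abs(v) for v in b)
    gap_v = variation_norm(a, r) * variation_norm(b, r) - variation_norm(p, r)
    gap_o = (oscillation_norm(a, u) * sup_b + oscillation_norm(b, u) * sup_a
             - oscillation_norm(p, u))
    return gap_v, gap_o


rng = random.Random(0)                   # fixed seed so the check is reproducible
trials = []
for _ in range(20):
    a = [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(12)]
    b = [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(12)]
    trials.append(product_gaps(a, b, r=2.5, u=[0, 4, 8, 11]))
```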

Consider a finite set $\Lambda = \{\lambda_1, \ldots, \lambda_L\}$ such that each dyadic frequency interval$^{8}$ of 
length 1 contains at most one element of $\Lambda$. For each $k \ge 0$ define $R_k$ to be the collection 
of the $L$ dyadic frequency intervals of length $2^{-k}$ which contain an element from $\Lambda$, and 
denote by $\omega_{k,l}$ the one that contains $\lambda_l$. Also, for each $k \ge 0$ and each $1 \le l \le L$ consider 
some multipliers $m_{k,l} : \mathbb{R} \to \mathbb{C}$. Define 

\[ \Delta_k f(x) := \sum_{l=1}^{L} \int_{\omega_{k,l}} m_{k,l}(\xi)\, \hat{f}(\xi)\, e^{2\pi i \xi x}\, d\xi. \]

The following theorem is a particular case of the main result of this section, Theorem 8.7. 
Theorem 8.2. For each $r > 2$ we have the inequality 
||sup|A fe /||| L2(R) <L 1/2 ~ 1/r sup sup WWim^il^gY^Wvr^W ||/||l3(r), 

k>0 l ll9ll i2(R) =l 

with the implicit constant depending only on $r$. 
8 that is, intervals in the saturated grid $\mathcal{G}'$ 
The gain in this theorem is in the exponent of $L$, given the fact that the triangle 
inequality trivially implies the result with $L$ replacing $L^{1/2 - 1/r}$. The remaining part of 
the bound is an amorphous quantity; its later estimate will depend on the multipliers in 
question. 

The proof of the above theorem presented below relies on a couple of lemmas and is 
heavily inspired by an argument of Bourgain for a particular case (Corollary 8.9), see 
[11]. We will denote by $l^2(L)$ the Hilbert space of all the finite sequences $(c_l)_{1 \le l \le L}$. The 
following result is classical. 

Lemma 8.3. For each finite set $A \subset l^2(L)$ with cardinality $\#A$, we have 

\[ \Big\| \sup_{a \in A} \Big| \sum_{l=1}^{L} a_l\, e^{2\pi i \lambda_l y} \Big| \Big\|_{L^2_y([0,1))} \lesssim \min(\sqrt{L}, \sqrt{\#A})\, \sup_{a \in A} \|a\|_{l^2(L)}. \]

Proof (Sketch). To obtain the bound involving $\sqrt{L}$, take absolute values everywhere and 
use Cauchy-Schwarz. To obtain the bound involving $\sqrt{\#A}$, estimate the left-hand side by 
the square function 

\[ \Big( \sum_{a \in A} \Big\| \sum_{l=1}^{L} a_l\, e^{2\pi i \lambda_l y}\, \chi(y) \Big\|_{L^2_y(\mathbb{R})}^2 \Big)^{1/2} \]

where $\chi$ is a bump function supported on $[-1, 2]$ that equals one on $[0, 1]$, and then use 
Plancherel's theorem. ■ 
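
The $\sqrt{L}$ half of Lemma 8.3 holds pointwise in $y$, so it survives any discretization of the integral. The sketch below approximates the $L^2([0,1))$ norm by a midpoint rule and compares it with $\sqrt{L} \sup_a \|a\|_{l^2}$. The frequencies and coefficient vectors are hypothetical sample data chosen here, with one frequency per unit interval as the standing assumption on $\Lambda$ requires.

```python
import cmath
import math


def sup_exponential_sum_l2(A, lambdas, samples=2000):
    """Midpoint-rule approximation of || sup_{a in A} |sum_l a_l e^{2 pi i lambda_l y}| ||_{L^2([0,1))}."""
    total = 0.0
    for j in range(samples):
        y = (j + 0.5) / samples
        phases = [cmath.exp(2j * math.pi * lam * y) for lam in lambdas]
        sup = max(abs(sum(al * ph for al, ph in zip(a, phases))) for a in A)
        total += sup ** 2
    return math.sqrt(total / samples)


def l2_norm(a):
    return math.sqrt(sum(abs(v) ** 2 for v in a))


# Sample data (assumptions): four separated frequencies and three coefficient vectors.
lambdas = [0.0, 1.0, 2.5, 4.0]
A = [
    [1.0, 0.5, -0.25, 0.75],
    [0.0, 1.0, 1.0, -1.0],
    [0.3, -0.7, 0.2, 0.9],
]
lhs = sup_exponential_sum_l2(A, lambdas)
rhs_sqrt_L = math.sqrt(len(lambdas)) * max(l2_norm(a) for a in A)

# With a single vector and integer frequencies the exponentials are orthonormal,
# so the norm equals the l^2 norm of the coefficients.
single = sup_exponential_sum_l2([[1.0, 1.0, 1.0, 1.0]], [0, 1, 2, 3])
```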

We use this lemma to prove the following. 
Lemma 8.4. For each set $C = \{c_k\} \subset l^2(L)$ and each $r > 2$ we have 

\[ \Big\| \sup_k \Big| \sum_{l=1}^{L} c_{k,l}\, e^{2\pi i \lambda_l y} \Big| \Big\|_{L^2_y([0,1))} \lesssim L^{1/2 - 1/r}\, \|c_k\|_{V^r_k(L)}, \]

with the implicit constant depending only on $r$. 

Proof The proof of this lemma relies on a standard metric entropy approach. It suffices 
to prove it in the case C is finite and then to invoke the Monotone Convergence Theorem. 
For each A > denote by M\ the minimum number of balls with radius A and centered 
at elements of C, whose union covers C. It is an easy exercise to prove that 

supAM^ /r < ||c fc ||v7-(L), (50) 

A>0 

with the implicit constant depending only on r. Let c* be an arbitrary element of C. For 
each n > — log 2 (diam(C)), let C n be a collection of elements of (C — C) U {0} such that 

||c||;2( L ) < 2~ n+2 for each c G C n , 

%C n < M 2 -u + 1 

and each c G C can be written as 

c = c* + c n with c n G C n . (51) 

n>— log 2 (diam(C)) 

Here is how C n is constructed. For each n > — log 2 (diam(C)) define B n to be a collection 
of M 2 -n elements of C such that the balls with centers in B n and radius 2~ n cover C. 
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If n — [— log 2 (diam(C))] — 1 define B n = {c*}. For each n > — log 2 (diam(C)) and each 
c G B n , choose an element d G B n _i such that the ball centered at c and with radius 2 _n 
intersects the ball centered at d and with radius 2 _n+1 . Define 

C n := {c-d :ceB n }U{0}. 

Since C is finite, for each c G C there is n such that c G B n . To verify the representa- 
tion (51) for an arbitrary c G C, denote as above by d the element from B n -\ associated 
with c, by c" the element from _B n _ 2 associated with c' and so on, and note that this 
sequence will eventually terminate with c*. Hence we can write 

c =( c -d) + (d -d') + ... + c*. 

Note also that by construction, each element of C n has norm at most 2~ n+2 . 
If for each $c \in l^2(L)$ we define

\[ X_c(y) = \sum_{i=1}^{L} c_i e^{2\pi i \lambda_i y} \]

then we have

\[ X_c(y) = X_{c^*}(y) + \sum_{n > -\log_2(\mathrm{diam}(C))} X_{c_n}(y) \quad \text{with } c_n \in C_n, \]

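for each $c \in C$. This together with inequality (50) and Lemma 8.3 further allows us to
write

\[ \Bigl\| \sup_{c \in C} |X_c(y)| \Bigr\|_{L^2([0,1))} \le \|X_{c^*}(y)\|_{L^2([0,1))} + \sum_{n > -\log_2(\mathrm{diam}(C))} \Bigl\| \sup_{c \in C_n} |X_c(y)| \Bigr\|_{L^2([0,1))} \]

\[ \lesssim \sup_{c \in C} \|c\|_{l^2(L)} + \sum_{n > -\log_2(\mathrm{diam}(C))} \min\bigl(\sqrt{L}, \sqrt{M_{2^{-n}} + 1}\bigr) \sup_{c \in C_n} \|c\|_{l^2(L)} \]

\[ \lesssim \sup_{c \in C} \|c\|_{l^2(L)} + \sum_{n \in \mathbb{Z}} 2^{-n} \min\bigl(\sqrt{L}, \|c_k\|_{V^r_k(L)}^{r/2} 2^{nr/2}\bigr) \]

\[ \lesssim \|c_k\|_{V^r_k(L)} L^{1/2-1/r}. \]
■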
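The last step of the chain above is a two-sided geometric sum. As a sketch, write $V := \|c_k\|_{V^r_k(L)}$ (a shorthand introduced here only for illustration) and split the sum at the scale where the two arguments of the minimum agree, namely $2^n = L^{1/r}/V$:

```latex
\[
\sum_{n\in\mathbb{Z}} 2^{-n}\min\bigl(L^{1/2},\, V^{r/2}2^{nr/2}\bigr)
\le \sum_{2^n \le L^{1/r}/V} V^{r/2}\, 2^{n(r/2-1)}
 \;+\; \sum_{2^n > L^{1/r}/V} 2^{-n} L^{1/2}
\]
\[
\lesssim_r V^{r/2}\Bigl(\frac{L^{1/r}}{V}\Bigr)^{r/2-1} + L^{1/2}\,\frac{V}{L^{1/r}}
= 2\,V\,L^{1/2-1/r}.
\]
```

Both sums are geometric because $r/2 - 1 > 0$, so each is dominated by its extreme term; the constant in the first sum degenerates as $r \to 2$. The remaining term $\sup_{c\in C}\|c\|_{l^2(L)}$ is at most $V L^{1/2-1/r}$ whenever the $V^r$ norm controls the supremum and $L \ge 1$.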



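Lemma 8.5. Let $(\omega_k)_{k \in \mathbb{Z}}$ be a sequence of nested dyadic frequency intervals with $|\omega_k| = 2^{-k}$
and let also $U := u_1 < \ldots < u_J$ be a sequence of positive integers. Then for each
$r > 2$

\[ \Bigl\| \Bigl\| \int \hat f(\xi) 1_{\omega_k}(\xi) e^{2\pi i \xi x}\, d\xi \Bigr\|_{O_U \cap V^r_k} \Bigr\|_{L^2_x(\mathbb{R})} \lesssim \|f\|_{L^2(\mathbb{R})}, \]
with the implicit constants depending only on r.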

Proof It suffices to assume that $0 \in \omega_k$ for each $k \in \mathbb{Z}$. We will say that an interval
[a, b] lies in the interior of the interval [c, d] if c < a < b < d, and refer to this property
as strong nestedness. Define a sequence $\ldots < k_{-2} < k_{-1} < k_0 < k_1 < k_2 < \ldots$ such that
for each i the interval $\omega_{k_{i+1}}$ lies in the interior of $\omega_{k_i}$ and none of the intervals $\omega_k$ with
$k_i < k < k_{i+1}$ lies in the interior of $\omega_{k_i}$. Define also $f_i$ by $\hat f_i := 1_{\omega_{k_i}} \hat f$.
We first estimate the $l^\infty_k(\mathbb{Z})$ component of the variational norm. Choose some Schwartz
function ( with l[_i,i] < C < l[-2,2] and fc> r each dyadic define Cw(£) — C( ^~|^ )- Note 
that 

sup |/* < sup sup |/i*C Wfc (aOI + \fi*lu> k (x)- fi*(u> k (x)\ 2 ) l/2 . 

k i ki<k<k i+1 i fei < fc<fei+1 

By Plancherel's formula, the square function above is bounded in $L^2$ by a constant multiple
of $\|f\|_2$. Since for each k and $x \in \mathbb{R}$

|C fc (x)|<Dil^|C|(x), 

we get 

sup sup \fi *(u, k (x)\ < (supDil^|C|)(x) * (sup |/<|) 

i ki<k<kiJ r i k i 

<M 1 (sup\f i \)(x). 

i 

Finally, to control supj |/j| we note that 

sup \U{x)\ < sup 1/ * <^{x)\ + \h{x) - f * C fc »| 2 ) 1/2 , 

i i 

and then use an argument as above to conclude that 

l|sup|/i(a;)||| L 2 (R) < ||/|| L 2(R). 

i 

This shows that 

||sU P |/*L fc (x)||| L 2 (R) < ||/|| L 2(R). 
k 

The estimates for the oscillation norm are now immediate consequences of the maximal 
estimates and the orthogonality of 

fj := (flu Uj \u> Uj+1 ) ■ 

Indeed 

sup |(/i.J^-(/i^)1lli 2(R) ) 1/2 = (Xill su p iaU)'-au,)1lli 2 (R)) 1/2 

j—l uj<k<u j+1 . =1 uj<k<Uj +1 

<(£ll su p i(£urnii 2(R) ) 1/2 <(xiii/,iii 2(R) ) 1/2 = ii/iiL 2 (R). 

We next focus on the variational part $\|\cdot\|_{V^r}$ of the norm. For each $f : \mathbb{R} \to \mathbb{R}$ define its
Poisson integral $P_t f := f * P_t$, $t > 0$, where $\hat P_t(\xi) := e^{-t|\xi|}$. The following is a consequence
of the variational result of Lepingle [26] applied to the Brownian martingale associated
with the harmonic function $u(x,t) := (f * P_t)(x)$ on the upper half plane:

\[ \bigl\| \|P_{2^k} f(x)\|_{V^r_k(\mathbb{Z})} \bigr\|_{L^2_x(\mathbb{R})} \lesssim \|f\|_{L^2(\mathbb{R})}. \qquad (52) \]

We will use this result together with the following corollary for averages. For each $\lambda \in \mathbb{R}$
and each $k \in \mathbb{Z}$, define

\[ A^\lambda_k f(x) := \int \hat f(\xi) 1_{[\lambda - 2^{-k}, \lambda + 2^{-k}]}(\xi) e^{2\pi i \xi x}\, d\xi. \]

Then (52) and a classical square function argument show that

\[ \bigl\| \|A^\lambda_k f(x)\|_{V^r_k(\mathbb{Z})} \bigr\|_{L^2_x(\mathbb{R})} \lesssim \|f\|_{L^2(\mathbb{R})}, \]

with the implicit constant independent of $\lambda$.
For simplicity denote 

f*l^x):=A^J{x). 

We proceed by estimating 

IIIIA^ fc /(a:)||^r(z)IUi(R) < IIIIA; fc /i(fc)(a:)|lv2-(z)ll^(R) + IIIIA; fc ft(fc)(a:)||^r(z)ll^(R)> 

where i(k) is such that k^k)-i < k < k^, and gi = — Note that the functions g-i 
are pairwise orthogonal. 

Since the sequence A u)k f i ^) is constant on each block ki < k < k i+ i it follows that 

||||4^«(^)ll^(Z)lk(R) = \\\\fi( X )\\vr(Z)\\L*(R) 

i 

Now, since $\hat P_t(0) = 1$ and by using the decay of $\hat P_1$ we get that for each $\theta \in \mathbb{R}$

£iW*)-£*(*)i 2 < E i^ 2fc f+ E iA(^)i 2 

< e i^2 fc f + e ii+02 fc< r 2 . 

Note that G cj fc . implies that 2 fcl |#| < 1 while 9 uo ki implies that 2 ki+1 \9\ > 1, and thus 
we get 

\\\\A^Ji { k)(x)\\yr (z) \\ L 2 (R) < ||/|| L 2 (R) . (53) 

Denote by \ the common endpoint of all the intervals uj k , ki < k < A: i+1 . Finally, (53) 
and the strong nestedness lead us to 

i i 

< imb(R) + ike ii^ i ^^)ii^( fcl < fc < fcl+l) ) 1/2 ii^w 

i 

^ ll/IU 2 (R)- 

This and (53) ends the proof of the lemma. ■ 

Proof of Theorem 8.2 Denote by (fik,i( z ) := ( r ^-*i(. m k,ilvk,i)Y( z ) an< ^ ^y B the best 
constant for which the following inequality holds for each f 1: ... ,f L e L 2 (R) with 
supp(/i) C [-2,2]: 

|| sup \^e 2wiXlZ {f l * ^)WIIUl(R) < B(£ ||/Hli 2( R)) 1/2 - 

It suffices to prove that 

B < L 1/2 - 1/r sup sup IHKm^l^ i1 ?)"(^)||^-(l)||l2 (r) . 
1 IIsIIl2(r) =1 
For each < y < we have by Plancherel's theorem that



Wfi - Tfy/i|U 2 (R) < 2 1 1 ^ 1 1 z ' 2 (R-) ' 



and hence we can write 



sup 

fc>0 



X>^*(/l *¥>*,!)(*) I 



1=1 



< 



L1(R) 



snpl^e^Tr,^*^)^)! 



k>0 



1=1 



+ 



L|(R) 



B 



+ ~(y^ II/*IIl 2 (r; 



,1/2 



Thus, by integrating in y, it suffices to prove that 

L 

sup I ^Te^A iV ^ (// # ^ mL2ym)) 



k>0 



1=1 



il(R) 

L 



< 



1/2 



L^-^sup sup \\\\(m k A^mz)\\v k HL)\\ L2(R) (J2 WfiWhin)) 1 ' 

1 ll9ll i 2(R)= 1 1=1 

We can estimate the first term above by first using Lemma 8.4 and then Minkowski's
inequality on $l^{r/2}(N)$ (for arbitrary N) by

^ 1/r ||||^ z U*^)WII^(L)IUl(R) 



^11^*^(^)11^)) 



1/2 



L 2 (R) 



^L 1 / 2 V'sup sup ||lk*^MWIIvx-(L)|| L2(R) (^||//||| 2( R) 



,V2 



= L^-iAsup sup ||||(m M l WM ?r(z)|| || l1( r)(^ ||/dl! 2( R)) 1/2 , 
1 IIsIIl2 {r) =i k l=1 

where the variational norm in the first term above is understood in the Hilbert space
$l^2(L)$. ■

An argument very similar to the above also proves the following version of Theorem 8.2: 

Theorem 8.6. Consider a collection R of L disjoint dyadic frequency intervals $\omega$. For
each $\omega \in R$ and each $k \in \mathbb{Z}$ let $m_{k,\omega} : \mathbb{R} \to \mathbb{C}$ be a sequence of multipliers. Define



Then for each r > 2 

|| sup |A fc /(x)||| L 2 (R) <L 1/2_1/r sup sup ||||(m fe)a; l^)~'(z)||y, (L) || i 2 (R) ||/|| i 2 (R) , 
k uieR \\g\\ L 2 (R) =i 

with the implicit constants depending only on r. 
It turns out that the results of Theorems 8.2 and 8.6 are not general enough for our
applications, and so we prove the following more general version. Consider now an arbitrary
set $\Lambda = \{\lambda_1, \ldots, \lambda_L\}$ with no further restrictions on it, and for each $k \in \mathbb{Z}$ define
$R_k$ as before. We now associate to each $\omega \in \bigcup_k R_k$ a multiplier $m_\omega : \mathbb{R} \to \mathbb{C}$ and define

\[ A_k f(x) := \int \sum_{\omega \in R_k} m_\omega(\xi) 1_\omega(\xi) \hat f(\xi) e^{2\pi i \xi x}\, d\xi, \qquad (54) \]

IKJvj'* := sup sup sup || || (m Uk l Uk 9)'{ z )\\v^L)\\i^(R)- 

Theorem 8.7. For each $r > 2$ we have the inequality

\[ \bigl\| \sup_k |A_k f(x)| \bigr\|_{L^2_x(\mathbb{R})} \lesssim L^{1/2-1/r} \|m_\omega\|_{V^{r,*}} \|f\|_{L^2(\mathbb{R})}, \]

with the implicit constant depending only on r.

Proof It suffices as before to assume that the index k runs through a finite interval
$\{a, a+1, \ldots, b\}$ with $a, b \in \mathbb{Z}$. We can find a sequence $a = k_0 < k_1 < \ldots < k_N = b$
with $N \le L$, such that for each $0 \le j \le N-1$, $R_k$ has the same cardinality when
kj < k < k j+1 . If fj : = (Eoje-R, _ ZLefl* then the functions f j are pairwise 

orthogonal. We can now bound || sup fc | A k f(x)\ ||l|( R ) by 

|| sup sup \( Yl 

J k 3 <k<k ]+l ueRkj+1 f>j 

+ II sup sup |(£ mulufjYix^Wmn). (56) 

j kj<k<k j+1 u( . Rk 

For each $\omega \in R_{k_{j+1}}$ and each $k_j < k < k_{j+1}$, $\omega(k)$ is defined to be the interval in $R_k$
containing $\omega$. Theorem 8.2 and scaling invariance show that the term (56) can be bounded
by

(£11 sup |(£m w l^r(^)llli S( R)) 1/2 < 

. kj<k<k ]+1 ^ eRk 

~ E Ll ~ 2/r SU P SU P SU P llll( m ^ 1 ^?)l^)||y fc ''(L)||i2 (R) ||/ i ||i2 (R) ) 1/2 

j 1 ^irc+i llfllli2(R)=1 

To estimate the term in (55), define the maximal operators 
0*(h)(x) := ^ sup |( £ mcffejlj)" 

We will argue that 



kj<k<k J+1 we n k 



su p °j( E //)wiUiw^ Ll/2_1/r ii^iivr(Eii^iii 2 (R): 



1/2 
It suffices to consider only dyadic values of L so we will assume that $L = 2^M$, for some
$M \ge 0$. For each $0 \le m \le M$, denote by $A_m$ the best constant for which the following
inequality holds for all discrete dyadic intervals $J = (j_1, j_2] := \{j_1 + 1, j_1 + 2, \ldots, j_2\}$ [footnote 9] $\subseteq$
$\{1, 2, \ldots, 2^M\}$ with $2^m$ elements

l|su P o;( £ /^(xjii^^An^ll/illi-cR)) 172 - 

r 7 i<i'<i2 jeJ 
We will use a reasoning similar to the one in the proof of the Rademacher-Menshov
inequality, to argue that $A_m \lesssim B_m$, where

\[ B_m := 2^{m(1/2-1/r)} \|m_\omega\|_{V^{r,*}}. \]

We can write for each $0 \le m \le M-1$ and each discrete dyadic interval $J = (j_1, j_2] \subseteq$
$\{1, 2, \ldots, 2^M\}$ having $2^{m+1}$ elements and midpoint $j_3 := j_1 + 2^m$



+ 



I|sup0;( fj')(x)\\ 2 LUR) < || sup q( Yl fi'){z)\\h.<R)+ 

jeJ 3<3'<h h+l<3<h 

o;( J] /iOWIkw + ll sup o*( ^ fr){x)\\ Li{ *\ 

i+i<j<h h+i<j<h j3+1 < f < j2 J 



3<j'<33 J1 J ■'• 33+l<3'<32 

We then use the definition of A m for the first two terms above and Theorem 8.6 for the 
third one, to bound the sum above by 

A 2 m £ ll//lli 2( R) + (An( Yl \\U\\»(R)) 1/2 + cB m ( \\fAh m ) 1/2 ) 2 

33+l<j'<32 3i<3'<33 33+l<j'<32 

< (A m + CB m ) 2 Y\\fj\\h(R)- 

3&J 

We conclude that $A_{m+1} \le A_m + C B_m$ for each $0 \le m \le M-1$, which together with the
fact that $A_0 = 0$ proves that $A_m \lesssim B_m$. ■
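The passage from the recursion to the bound on $A_m$ is a geometric summation. The following sketch assumes that the recursion reads $A_{m+1} \le A_m + C B_m$ with $A_0 = 0$ and that $B_m = 2^{m\theta} N$, where $\theta := 1/2 - 1/r$ and N denotes the multiplier norm factor in $B_m$; both $\theta$ and N are shorthands introduced here.

```latex
\[
A_m \;\le\; A_0 + C\sum_{m'=0}^{m-1} B_{m'}
\;=\; C N \sum_{m'=0}^{m-1} 2^{m'\theta}
\;=\; C N\,\frac{2^{m\theta}-1}{2^{\theta}-1}
\;\le\; \frac{C}{2^{\theta}-1}\, B_m .
\]
```

Since $\theta > 0$ exactly when $r > 2$, the sum is dominated by its last term, which is why no logarithmic loss in L appears; the constant $C/(2^\theta - 1)$ grows without bound as $r \to 2$, consistent with the implicit constant depending on r.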

Remark 8.8. In our later applications of Theorem 8.7 the parameter r will be chosen
sufficiently close to 2, making the dependency on L of the $L^2(\mathbb{R})$ norm of the weighted
maximal operator negligible. The fact that the $L^2(\mathbb{R})$ norm goes to $\infty$ as L gets larger
follows from the result in [13], where it is proven that this norm is at least of the order of

(log L)^. 

If the multipliers in the above theorem are chosen to be the constant function 1,
we recover (modulo a slightly larger L bound) the result of Bourgain from [11], via the
variational estimates in Lemma 8.5. Bourgain's L bound is $(\log L)^2$ rather than $L^{1/2-1/r}$;
however, our slightly larger bound will suffice for our application, since we will take r as
close to 2 as we want. We state Bourgain's result for future reference.

Corollary 8.9. Assume we are in the setting of Theorem 8.7 and that $m_\omega = 1$ for each
$\omega$. For each $r > 2$ we have the inequality

\[ \bigl\| \sup_k |A_k f(x)| \bigr\|_{L^2_x(\mathbb{R})} \lesssim L^{1/2-1/r} \|f\|_{L^2(\mathbb{R})}, \]

with the implicit constant depending only on r.

9 Here $j_1 = a 2^b$ and $j_2 = (a+1) 2^b$ with $a, b \in \mathbb{Z}_+$.

An interesting question regards the dependency on L of the $L^q(\mathbb{R})$ norm of the operator
in Theorem 8.7, for $q \in (1,2) \cup (2,\infty)$. The fact that this norm is large as a function of L
is already apparent at a single scale. Due to the equality = ||?7i||M/ for dual pairs 

(q, q'), it suffices to note the following. 

Proposition 8.10. For each L e N and q e (2, oo) there is a choice of signs (£j)i<z<l 
such that if f L = 1[o,l] then 

1=0 

Proof It immediately follows that 

IKE I / ft(0l[M + il(0e 2 ^X| 2 ) 1/2 ||^(R)-^ 1/2 . 
i=o J 

Khintchine's inequality ends the proof. ■ 

This shows that the $L^q$ norm of the maximal operator $A^* f(x) := \sup_k |A_k f(x)|$, with
$A_k$ defined in (54), satisfies

HAI^r^r) >L |1/2 - 1/?l |Klk r . 

Theorem 8.7 will be used in Section 10 to control maximal operators. For the proof of 
the oscillation inequality leading to the dense class results, we will need a more general 
version of Theorem 8.7. 

We will assume that L, $\Lambda$, $R_k$, and $A_k f$ are as in Theorem 8.7. Of relevance for
the estimates in the next theorem is the following multiplier norm

IWIoun^.* := sup sup sup ||||(m a , fc l a;fc ^)~'(z)||o u nK'-(L)|L rRV 
i \ieu k eR k \\g\\ L2(R) =i 

where $U := u_1 < u_2 < \ldots < u_J$ is an arbitrary finite sequence of integers.

Theorem 8.11. Let $U := u_1 < u_2 < \ldots < u_J$ be an arbitrary finite sequence of integers.
The following inequality holds for each $r > 2$

(£ I' SU P \ A *K X ) - A ^/(*)llfe(R)) V2 £ J^L^\\mJ 0v nvr,4f\\L H n), 

j = l Uj<k<U j + 1 

with the implicit constant depending only on r (in particular it does not depend on either
J or $u_1, \ldots, u_J$).

Remark 8.12. The only relevant thing about the exponents and 1 — is that the first 
is less than | and lim r ^ 2 (l — 2 ) = 0. 
Proof To prove the above theorem we need two inequalities. In the first inequality we
aim for small L dependent bounds and tolerate a trivial J dependent bound. On the other 
hand, in the second inequality we aim for a J independent bound but we will tolerate a 
big L dependent bound. 

Note that for each j we have as a consequence of Theorem 8.7 

II sup \A k f(x) - A Uj f(x)\\\ L 2 {R) < ||sup|A fe /(x)||| L 2 (R) 

Uj<k<UjJ r i k 

<L 1/2 - 1/r ||rn w || 0o nvr..||/||L"(R). 
Thus, we get our first main inequality by doing rough estimates: 
j-i 

(£|| sup \A k f - A^flWl) 1 / 2 < J^L^-^WmJovnvrWfh- (57) 

Fix now some $1 \le j \le J-1$. For each interval $\omega \in R_{u_j}$ pick some $l \in \{1, \ldots, L\}$
such that $\lambda_l \in \omega$. Denote by $A(j)$ the set of all these l. For each $l \in A(j)$ and each
$u_j < k < u_{j+1}$, denote as before by $\omega_{k,l}$ the interval in $R_k$ containing $\lambda_l$. Define for each
$z \in \mathbb{R}$

\[ a_\omega(z) := (m_\omega 1_\omega \hat f)^\vee(z). \]
Since $|R_{u_j}| \le |R_k| \le |R_{u_j}| + L$ when $u_j < k < u_{j+1}$, we can evaluate

sup \A k f(z) - A Uj f(z)\ 2 = sup | a "( z ) - E a ^)! 2 
Uj <k<u J+1 Uj <k<u J+1 wei?fc ueRu _ 

< sup I E ( a ^A z ) - a ^ ,i ( z ) ) 1 2 + L 2 SU P I a ^.i ( z ) 1 2 

Uj<k<u j+l l( _ A ^ k,l 
leA ^Uj<k<u j+1 k,l 

Note also that there is a set $V \subset \{1, \ldots, J-1\}$ with at most L elements such that
$|R_{u_j}| \ne |R_{u_{j+1}}|$ for $j \in V$, and if $j \notin V$ then we can improve on the bound obtained above

sup \A k f(z) - A Uj f(z)\ 2 <LJ2 su P K k ,i( z ) ~ a ^A z )\ 2 - 

Uj<k<u j+1 leA ^v,j<k<u j+1 

For each I G A(j) define x\ 3 \z) : = sup u .< fc<u . +l \(a Ukil (z) - a^. ^z)^. By summing over j 
we get 

sup \A k f(z)-A u J(z)\ 2 <Lj2 X)(^' ) W) 2 + ^8UPK W WI 2 

j=l u j< k < u j+i j=i i e A(j) k,t 

J-l 



< ^ E E 1 AO-) (o (*i (*) ) 2 + l3 su p i a ^ (*) i 

2eA j=i 

< ^E'l^wiiou + L3su pi a ^, i ( z )i 

2eA fc,i 

< L 3 SUp |K M (z)||^ ny , (z) . 
This together with integration with respect to z produces the second main inequality



'J-l \ V2 

Y, II sup \A k f(x) - A % ./(x)||||, (R) < L 3 / 2 ||m w || 0u n^||/||L 2 (R). 
0=1 u ^ fe <«i+i / (58) 

Finally, by interpolating between (57) and (58) it follows that 

'J-l \ X / 2 

£|| sup |A fe /(x)-A % ./(x)|||i, (R) ^(J^^-^O^^^^IkJIoun^H/IU^R) 



_j Uj<k<Uj + l 



J^L'-^WmJovnvr.WfhHn). 



We also record the following immediate corollary. 

Corollary 8.13. Assume we are in the setting of Theorem 8.11 and that $m_\omega = 1$ for each
$\omega$. For each $r > 2$ we have the inequality

sup \A k f(x)-A u J(x)\\\l UR) )^ < J^L^H/IU^h), 

j=1 Uj<k<u j+1 

with the implicit constant depending only on r. 

9. Variational, oscillation and square function estimates 

In this section we prove a few auxiliary results of general interest, which combined with 
Theorem 8.7 will be used later to control the measure of various exceptional sets. The 
following result is classical. 

Proposition 9.1. Let D be a finite collection of dyadic intervals included in some
interval X, each of which is associated with a function $\phi_I$ satisfying:

J 7 (x)dx = 0, (59) 
<pj is C-adapted to I (60) 



If ai G C are such that 



for each dyadic I , then 



-L £ \ ai \ 2 ) 1/2 < B , 

Un — 



/£D 



^ o/0/||bmo(r) %CB, 



ieT> 



II a iM^(ii) < C£?|J| 1/s , 1< s < 00, (61) 
/eD ( , 

with the implicit constants depending only on s.
Proof The BMO estimate follows as in Proposition 9.3; we do not insist on the details
here. The estimate (61) is an immediate consequence of the first estimate, the
John-Nirenberg inequality and the fact that

E M/0*0I <Cx!(z), x£2I. (62) 
ier> 

■ 

The next lemma will be used to prove bounds on the variational norms of operators.

Lemma 9.2. Let $D_0$ be a finite collection of dyadic intervals I each of which is associated
with a function $\phi_I$ satisfying (59) and (60) for a fixed C. Let $\zeta$ be some fixed Schwartz
function with $1_{[-1,1]} \le \zeta \le 1_{[-2,2]}$. Then for each $a_I \in \mathbb{C}$

E ii E a ^ - E a ^ * Dil 2* ciii 2( R) < c 2 e ki 2 - 

fcez ieo / e D /eD 

I|>2 fc 

Proof We start by making a few observations. Define $\tilde\phi_I(x) := \phi_I(x + c(I))$ and note
the following consequences of (59) and (60):

S(o) = o, ||^(0lloo<|/| 3/2 

|S(0I <lell|^(OI<C|el|/| 3/2 (63) 
^(0llmR)<C|7|^ (64) 



1^(01 <^ll^(^(-))IUi(R)^^^ 0<j<2. (65) 

Fix t > 0. The almost orthogonal behavior of the collection <fii^ '■= <fii — <fii * Dil^C, 
with \I\ = 2 k+t , is quantified by the following properties: 

POD 

\{<t>i,kAj,k)\< / |$}(0€(OK<C 2 2^, a consequence of (65), 

J2- k 

\(h,k,h,k)\ = i f C(()C(a(i-((2'()) i >! M|I| "" I ™ ! ii(i 

/ CI/l \ 2 

\\c(J) -c(I)\J 
and hence for each \I\ = \J\ = 2 k+t 

C 2 2~ l 



\(h,k,(f>j,k)\ < - !,,./)_,.(/) ■ 

U H p+t — ) 
An immediate corollary of this is that

ii E *iM\h<R) s c>2-< e m 2 - 

ieu ieu 

7 | =2 fe+i |j| = 2 fe+t 

An application of the triangle inequality first and then Minkowski's inequality gives 

\2 



E ii e - E a A(oc(2 fc o)iii»cR) s EE ii E «"MU 2( R)) 2 

feeZ t>0 -feD 

/| = 2 fe + i 

^(E(E 2 " E i^i 2 ) 1/2 ) s 

t>0 fcGl 

<^ 2 Ei^i 2 



fceZ ieD ieD feeZ t>0 -feDo 

|I|>2 fe |I|>2 fe |J| = 2 fe + i 



|/|=2fc+* 



/eDo 



Fix now £ < 0. The almost orthogonal behavior of the collection 0/^ := <pj * Dil^-feC? 
with |/| = 2 k+t , follows as before, by now invoking (63) and (64) instead: 

r 2l ~ k ~ _ 
\(hk,hk)\ < / |$)(0€(0l^<2- fe ||$]|U-(R)||€||L-(R) <C7 2 2*, 
Jo 

~ ~ C 2 2* 

\m,k,(f>j,k) \ < 



2 



(1 + ^^) 2 ' 

We obtain as before 

wt-\ e *Moan))(x)\\% {R] < c 2 e ^ 

ier> /eD 

|/|<2* 



Proposition 9.3. Let D be a finite collection of dyadic intervals I contained in some
interval X, each of which is associated with a function $\phi_I$ satisfying (59) and (60) for a
fixed C. Consider also a sequence $U := (u_j)_{j=-\infty}^{\infty}$ of integers. If $a_I \in \mathbb{C}$ are such that

(-L £ KHV 2 < 5 (66) 



iSD 



for each dyadic $I_0$, then

llll E ^(^llounVJ^IlBMO^R) + || (E I E a ^/| 2 ) 1/2 HBMO(R) < CB, T > 2 



i£D feeZ iSD 

/|>2 fc |/|=2 fe 



llll E a ^llounv fc --(L)IU'(R)+||(El E M/| 2 ) 1/2 |U-(R) £ Cfi|J| 1/s , r>2, Kkoo, 

i6D () feeZ ieDo 

/|>2 fc |/|=2 fc 

with the implicit constants depending only on s and r.
Proof We will only prove the variational estimates; the argument for the oscillation and
square function inequalities follows a very similar path. It suffices to prove the BMO
bound. Indeed, this together with the John-Nirenberg inequality, trivial estimates of the $V^r$
norm by the $V^1$ and $V^2$ norms and (62) will immediately give the desired $L^s$ estimate.

Consider some arbitrary interval J and define $D_1 = \{I \in D : |J| \ge |I|, I \subset 4J\}$,
$D_2 = \{I \in D : |J| \ge |I|, I \cap (4J)^c \ne \emptyset\}$, $D_3 = \{I \in D : |J| < |I|\}$ and $b_J =$
II 'eo 3 a/0/(c( J)) || v r (L) ■ Define also for i = 1, 2, 3 

|J|>2* k 



H X ) = II Yl a ^^ X )Wv^L)- 



IED { 
\I\>2 k 



It suffices to prove the following BMO estimates for each Fi separately, with the implicit 
constant depending only on r 



(67) 



We start with estimates for $F_1$ and write

lL(R) 



^il 2 < ll^il' 2 



< 



+ 



+ 



a 7 0/(x) - ^ a/0/ * Dil^ C(x)\\ V r( L ) 

/gDi 



\I\>2 k 



Y ai<t>i * Dilsfc ((x) ~ p 2k(^2 a M(. x )\\vi 



(L) 



ieDi 



\\P 2 k(^2 a I^l){ X )\\v^L) 



/eDi 



(68) 



(69) 



(70) 



Using the result of the previous lemma and estimate (66) we easily bound the term (68)
by a universal constant multiple of $C^2 B^2 |J|$. Then the mean zero of $\zeta - P_1$ and (61) show



that 



(69) < 



^M^Dii^C-Pi) 



/GDi 



< II E^Hi^^^Vl, 



L2(R) 



/eDi 



while (52) and (61) imply the same bound for (70). 

The terms corresponding to $F_2$ and $F_3$ are estimated trivially. First, for each $x \in J$, (60)
and (66) imply that

F 2 (x)< \aiMx)\<BC, 

and hence 

r \F 2 \ 2 < B 2 C 2 \J\ 
On the other hand, for each $x \in J$, (60) and (66) imply that

\Fs(x)-bj\< \ai\\Mx)-Mc(J))\<BC, 
\i\er> 3 

and so again 

- btf < B*C 2 \J\- 

An application of the triangle inequality finishes the proof of (67) and of the proposition. ■

Proposition 9.3 and the discussion from the end of Section 7 imply the following
fundamental estimates for a 2-quasitree.

Theorem 9.4. Let $U := (u_j)_{j=-\infty}^{\infty}$ be an arbitrary sequence of integers. For each
2-quasitree T with top $(I_T, \xi_T)$, each $l, M \ge 0$, $r > 2$ and $1 < t < \infty$

llll Yl (/ ; ^)0i° T (a;,^T)||o u ny ( r(L)||BMO ;c (R) + ll($^l (f, ¥s)(I)%(x^t)\ 2 ) 1/2 \\bmo x (r) 

|/ s |<2 fc |/|=2* 

< 2" M W(T) 

and 

Ounvjr(L)IU*(R) + IKj^l (/,^)0i° T (a;,CT)| 2 ) 1/2 ||L*(R) 

|is|<2 fc |/|=2* 

<2- M W(T)|/ T | 1 / i , 
with the implicit constants depending only on r, t and M.

10. Pointwise estimates outside exceptional sets

Let P be a finite set of tiles which can be written as a disjoint union of trees T with
tops T

\[ P = \bigcup_{T \in F} T. \]

To quantify better the contribution to various model sums, coming from individual tiles,
we need to reorganize the collection in a more suitable way. For each $T \in F$ define its
saturation

\[ G(T) := \{ s \in P : \omega_T \subseteq \omega_s \}. \]

For the purpose of organizing G(T) as a collection of disjoint and better spatially localized
quasitrees we define for each $l \ge 0$ and $m \in \mathbb{Z}$ the quasitree $T_{l,m}$ to include all tiles
$s \in G(T)$ satisfying the following requirements:

• \I s n(2 l I T + 2 l m\I T \)\ > ^ 

• either \I S n (2'/ T + 2<m|/ T |)| ^ ^ or |/ s n (2 l I T + 2'(m - 1)|J T |)| + lJ r- 
Obviously, for each $l \ge 0$ the collection consisting of $(T_{l,m})_{m \in \mathbb{Z}}$ forms a partition of G(T)
into quasitrees. The top of $T_{l,m}$ is formally assigned to be the pair $(I_{T_{l,m}}, \xi_T)$, where $I_{T_{l,m}}$
is the interval $2^l I_T + 2^l m |I_T|$ while $\xi_T$ is the frequency component of the top $(I_T, \xi_T)$
of the tree T (considered as a quasitree).
Let $T_{l,m} = T^{(1)}_{l,m} \cup T^{(2)}_{l,m}$ be the standard decomposition of $T_{l,m}$, where both $T^{(1)}_{l,m}$ and
$T^{(2)}_{l,m}$ are formally assigned the same top as $T_{l,m}$. Denote by $F_{l,m}$, $F^{(1)}_{l,m}$ and $F^{(2)}_{l,m}$ the
collections of all the quasitrees $T_{l,m}$, $T^{(1)}_{l,m}$ and $T^{(2)}_{l,m}$, respectively.

Consider $\sigma, \gamma > 0$, $\beta \ge 1$, $r > 2$ and the complex numbers $a_s$, $s \in P$. Let also $u_1 <
\ldots < u_J$ be an arbitrary finite sequence of integers. The first result in this section is the
crucial estimate behind Theorem 6.2.

Theorem 10.1. Assume we are in the settings from above and also that the following
additional requirement is satisfied

\[ \sup_{s \in P} \frac{|a_s|}{|I_s|^{1/2}} \le \sigma. \qquad (71) \]

Define the exceptional sets 

E<» := U{* = £ W*) > 02 21 }, 

E(2) ■= U U i x ■ II E m)) MT)\\^(z) > 72- , (|m| + l)- 2 }, 



l,m>0 Te?: (2) »£T 



E(3) -=[j U {*■■ II E «4t(^t)||^ (z) > 72-'}, 



l>0 Tc? r(2) »6T 



where the symbol $a(l,m)$ equals l if $m = 0$ and $l + [\log_2 |m|]$ if $m \ne 0$.
Then for each $x \notin E^{(1)} \cup E^{(2)} \cup E^{(3)}$ we have

||( ^ a a ^(s,0)) fceZ ||A,.,(R) </? 1/2 - 1/r (7 + ^), (72) 

|-T S |<2* 

with the implicit constants depending only on r. 

Proof For each $l \ge 0$ and each $x \in \mathbb{R}$ define inductively



Fi,x 
V , x 



:= {T G T,x G It} 

{T G F,x G 2'/ T \ 2 Z - 1 / T }, Z > 1 

U G ( T ) 

P^:= |J G(T)\U^,„ i>l 

TeTi, x i'<i 

E X>1 := {c{uj t ) : T G .F, )X }. 

Note that for each $x \in \mathbb{R}$, $\{P_{l,x}\}_{l \ge 0}$ forms a partition of P. Since $x \notin E^{(1)}$ it also follows
that $\# \Xi_{x,l} \le \beta 2^{2l}$.

Fix $x \notin E^{(1)} \cup E^{(2)} \cup E^{(3)}$ and focus on estimates for the left-hand side of (72). For each
k G Z and / > let f2 fc ,z be the collection of dyadic frequency intervals of length 2~ k which 
contain an element of S Xj /. Let fl k ,i be the collection of all (dyadic) siblings of intervals in 
flk-1,1 that are not themselves in Q k ,i- Observe that both (J fe , and Jl*^ U {J k , <k &k',i 
are collections of pairwise disjoint intervals which cover $\{\omega_{s,2} : s \in P_{l,x}\}$. Moreover we
can write 

^2 a s (f) s (x,6)= ^ E a s (f) s (x,9) 

sePi, x -.\i s \<2 k wen M °£Vi,x 

\l a \<2 k , ^n^ Sj2 ^0 

+ E E u*) E 

Indeed, if l u (0)<f> s (x, 0)^0 for some uj G fi^,/ U Ufc'<fc ^fe',i an d s e ^,x, then this implies 
that oo fl u; Sj2 7^ 0. Moreover, when cj G this latter restriction together with \I S \ < 2 k 
is equivalent with just asking that u C u; Sj2 - Similarly, when u G IJfe'<fc ^fc',J then u; Sj2 C a; 
is impossible, which in turn makes the requirement |/ s | < 2 k superfluous. Indeed £ u 
would imply that u s C u, contradicting the fact that oo s contains an element from S Xj j 
while a; does not. Hence we can rewrite 

a 9 <j> a (x,0) = E E (73) 

+ E E u*) E fl .^.fl)- (74) 

The multiplier in (74) can be written more conveniently as 

( x - E **) IE E u*) E j =(1- E **) E °^M)> 

given the fact that (Uj 6 n fcil J ) c = U k ><k lUn fcV 1 and (U'< fe ^fc',i) fKlW ^fc',i) = 0, 
modulo the endpoints of intervals. The above multiplier operator is the composition of 
two operators. The first one is the identity minus an operator for which Corollary 8.9 
provides good bounds. The second one is associated with the multiplier J2 s€Vl a s (p s (x, 6) 

and hence its L 2 norm will equal 

|| a»0«(M)IUg°(R)- (75) 

seV liX 

We will start by estimating (75), and note that this will implicitly provide a proof of 
inequality (13) (and thus of Carleson-Hunt's Theorem), along the lines of the argument 
in the next section. We leave the details to the interested reader. 

Fix a 9 10 and note that the main contribution to (75) comes from a single tree. More 
precisely, let sg be a maximal element with respect to the ordering of tiles in the collection 

A = {s eV l:X : 9 e u S:2 }. 



10 It suffices to assume 9 is not a dyadic point 
If $T \in F_{l,x}$ (with top T) denotes one of the trees such that $s_\theta \in G(T)$, then nestedness
implies that $A \subseteq G(T)$. We also recognize as a consequence of the definition of $P_{l,x}$ that



a s s (x,6»)| = | a s (p s (x,9)\ 
seVi, x seG(T)nr, x 



<| ]T a a a (M)| (76) 
seT ; ( +i,o n ^.- 

s e T i+i,o nP ^ 

+ E M.(M)|, (78) 



seP:|/ s |<|/ T | 
dist(x,I s )>2 ( - 2 |J T | 



where $T_{l+1,0}$ is the quasitree obtained from T by using the procedure in the beginning of
the section, while $T_{l+1,0} = T^{(1)}_{l+1,0} \cup T^{(2)}_{l+1,0}$ is the standard decomposition of $T_{l+1,0}$. The
term (78) is an error term and it is bounded crudely by $\sigma 2^{-Ml}$, by using the triangle
inequality, (46) and (71).

We next focus on (76). Note that at most one scale in $T^{(1)}_{l+1,0}$ contributes to the
summation (76). Thus crude estimates relying on (71) prove that



a s (f> s (x,9)\ < sup y~] \a s <f> s (x,\ 



c C t (1) r\Vi "' eZ seP: i'»i= 



<<t2 



-Ml 



Before we evaluate the sum corresponding to the 2-quasitree, we make two useful
remarks. The first one concerns the fact that there exists $n_T$ depending on x and l such
that



Ta >0 n^ >se = { a GT l ( J ) 1>0 :2^<|/ - |}. 



The lower bound on the scale is an immediate consequence of the definition of Vi lX - The 
second observation states that if <f>^ (2) (x, 6) ^ for some s G , then \6 — £t; + i 1 < 

S ' T i + l,0 
We then invoke inequalities (48) and (71) to estimate



E 



Sfcl i+1,0 
|/ S |>2"T 



^ 1,(2) ( X l 1 
*'-"-i+l,0 



E 

p(2) 



>[ L(2) ( X , 1 
*'-"-i+l,0 



°= 1+1,0 

2™T<|/ 3 |<|e-s Ti+io |-i 



< E KII^U) (^^-/U) (z,6r l+1 , 



sex' 2 ) 

3fc i+1,0 



1+1,0 



+ 



E 



j(0 



seT (2) 
sel i+l,0 

2"T<|/ S |<|e- Sx |-i 



(2) 

1+1,0 



i+1,0 



i + 1,0 > 



<a2-^ + || £ (x,eT I+li0 )||v7(z) 

- - S ' A 1+1,0 



sex' 2 ) 
sfe 1 I+1,0 

/ s |<23 



< ^2" Mi + 7 2 

where in the last inequality we rely on the observation that $x \notin E^{(3)}$.
We thus end up having the following estimate for (75)

II E a sM x ,0)\\ Lr(Il) <a2- Ml + 1 2- 1 . 



(79) 



Finally, the triangle inequality in l, an application of Corollary 8.9 and the fact that $|\Xi_{x,l}| \le
\beta 2^{2l}$ imply that



]T(1- !*) E a A(M) 
2>o wesi(., ( se?V 



fcez 



</3 1/2 - 1/r (a + 7 ). (80) 



M* (R) 



We will next turn our attention to the term (73). The multiplier in (73) is of the form 

XL e n M l u{Q)™>w{x,Q), where we define 

mu{x,6) := Y a a <f> a (x,9). 

To estimate the norm M% of this sequence of multipliers we will use Theorem 8.7 with 
ftk,i as the collection Rj,. Fix / and consider a collection of nested intervals Uk G Qk,h 
k G Z. For the remaining part of the proof we will be concerned with obtaining pointwise 
estimates in x for the quantity 



IK? 1 ** E a >M x >-)Y{z)\\vi 



(L) 



(81) 



L§(R) 



which are uniform over all functions g with $\|g\|_{L^2(\mathbb{R})} = 1$, where the inverse Fourier
transform of the innermost expression is taken with respect to the variable $\theta$.
Fix g. We observe that the collection



B := {u S)2 : s G Vi, x , uj k C lo s ^ 2 for some fceZ} 

consists of nested intervals. Since this collection is finite, it contains a smallest element, 
corresponding to some s G such that w S0i2 C w Si2 whenever ui s> 2 G £>. Now s G G(T) 
for some T G and hence all the tiles contributing to the term (81) are in G(T). For 
each m G Z, we denote by T; >m the quasitree obtained from T by the procedure described 
in the beginning of the section. Choose some arbitrary c G f\ °°k- This kind of choice for 
c will make possible the estimation of the two error terms below by rather trivial methods. 

The next adjustment has to do with the fact that $m_\omega$ is not constant. We will write it
as the sum of a main (constant) term and two error terms $m_\omega = m^{(1)}_\omega + m^{(2)}_\omega + m^{(3)}_\omega$, with

E o s s (x,£ T ) 
E a s ((f) s (x,6) - (j) s (x,c)), 

seG(T)nv l}X 

E a s ((f) s (x,c) - (f> 8 (x,£ T ))- 

sea(T)nv hx 

In dealing with the first error term we get the following sequence of inequalities, uniformly 
in x 

k 

<su P ( J2 \™$(x,9)\Y 2 

6 fc:|w fc |>|0-c| 

<su P ( E ( CT E \i s \ 1/2 \U^o) - u^c)\) 2 Y' 2 

9 fc:|o; fc |>|e-c| se:G(T)nv tx 
kl>KI 

<su P ( e (°- E ic-*iwx?(*)) 2 ) 1/2 

9 fc:|u;fc|>|0-c| «^G(T)nP Ll 
kl>|u, fc | 

<su P ( e hc-^iki-^) 2 ) 172 

9 fc:|w fc |>|9-c| 

< rf""'. (82) 



The passage from the first to the second line above is insured by the trivial inequality 
II ' II v ~ 2 1| • while the passage from the fourth line to the fifth relies on the estimate (38) 
on the 9 derivative of <f> s (x, 6). The passage from the fifth line to the sixth line relies on 
the fact that s G Vi >x implies x £ 2 l ~ 1 I s . 



™£!(M) 



m<2(M) 



rog>(M) 
To estimate the second error term we invoke Lemma 8.1, Lemma 8.5 and the fact that
$\|\cdot\|_{V^r} \lesssim \|\cdot\|_{l^1}$

IIII(?1^^S(^-))^)IIW)IUIW ^ ||(?W>)||v;(L)|Ul(R) 

< E \ a s(<f>s(x,c) - (f) s (x,£ T ))\ 

sea(T)nv l}X 

< k-e T i E E 

2"<|c-^ T |- 1 1^1=2" 

x<£2 l -~ L I s 

< a2~ Ml . (83) 
The last task is to get estimates for the main term. We decompose each m^l as 



meZ meZ 

where 



™2l!m(M) = E a A(s,6r)- 



Then we estimate 



iiii(?i. fc m«(x,.))^)ii^(L)iUi(R) s E iiii(^^S(^-))^)ii W) iil 2 (r) + 

mez (84) 

+ E llll(^ fc " l &fm(^0)^)|fe(L)||^(R). 

mez (85) 
In analyzing the term (84) we note that for each T, I and m the collection 

C = \J{seT\%nVi^:u k Cu a , 2 } 
k 

contains at most one scale. This is because the collections {oo s ^ '■ s G C} and {cj Sj i : s G C} 
are nested. Also, if I > 1 and s G C then x ^ 2 a ( /,m ) -1 /,,. Thus, for each m G Z and each 
/ > 1, Lemma 8.5 gives 



i(^-))^)|| W) |Ui(R) < sup E \<\M*,£t)\ <m a2~ M ^ m \ 



jeZ s ev.\i s \=v (86) 

!,m)-l r 



m (l,l) 



and by a similar argument, the same works for $l = 0$, too.

Next, we consider the term (85). We first acknowledge the fact that for each $k, l, m$
there exist $n_3 < n_4$ such that

\[
\{ s \in T^{(2)}_{l,m} \cap \mathcal{P}_{l,x} : \omega_k \subseteq \omega_{s,2} \} = \{ s \in T^{(2)}_{l,m} : 2^{n_3} < |I_s| < 2^{n_4} \}.
\]

The number $n_3$ is independent of $k$ and appears as a restriction due to the fact that at
level $l$ we only consider tiles that have not been selected at previous stages. The restriction
$|I_s| < 2^{n_4}$ replaces the restriction $\omega_k \subseteq \omega_{s,2}$, and $n_4$ is increasing as a function of $k$. This
observation together with Lemma 8.1, Lemma 8.5 and the fact that $x \notin E^{(2)}$ implies
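\begin{align*}
\big\| \|(g 1_{\omega_k} m^{(1,2)}_{\omega_k,m}(x,\cdot))^{\vee}(z)\|_{V^r_k(\mathbf{Z})} \big\|_{L^2_z(\mathbf{R})}
&\lesssim \Big\| \sum_{\substack{s\in T^{(2)}_{l,m}\cap\mathcal{P}_{l,x} \\ \omega_k \subseteq \omega_{s,2}}} a_s \phi^{(\alpha(l,m))}_s(x,\xi_T) \Big\|_{V^r_k(\mathbf{Z})}
\, \big\| \|(g 1_{\omega_k})^{\vee}(z)\|_{V^r_k(\mathbf{Z})} \big\|_{L^2_z(\mathbf{R})} \\
&\lesssim \Big\| \sum_{\substack{s\in T^{(2)}_{l,m} \\ |I_s| < 2^j}} a_s \phi^{(\alpha(l,m))}_s(x,\xi_T) \Big\|_{V^r_j(\mathbf{Z})} \\
&\lesssim \gamma 2^{-l} (|m|+1)^{-2}. \tag{87}
\end{align*}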
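Thus, summation over $m$ in inequalities (86) and (87) leads to

\[
\big\| \|(g 1_{\omega_k} m^{(1)}_{\omega_k}(x,\cdot))^{\vee}(z)\|_{V^r_k(\mathbf{Z})} \big\|_{L^2_z(\mathbf{R})} \lesssim (\sigma + \gamma) 2^{-l}. \tag{88}
\]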
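
The summation over $m$ behind (88) can be spelled out as follows. This is only a sketch under two assumptions that are not stated in this passage: that $[\cdot]$ in $\alpha(l,m)$ denotes the integer part (with $\alpha(l,m) = l$ for $m = 0$ and $\alpha(l,m) = l + [\log_2|m|]$ for $m \neq 0$, as in the statement of Theorem 10.2), and that the decay parameter satisfies $M \ge 2$.

```latex
\begin{align*}
\sum_{m\in\mathbf{Z}} \sigma 2^{-M\alpha(l,m)}
  &= \sigma 2^{-Ml}\Big(1 + \sum_{m\neq 0} 2^{-M[\log_2|m|]}\Big)
   \le \sigma 2^{-Ml}\Big(1 + 2^{M}\sum_{m\neq 0}|m|^{-M}\Big)
   \lesssim_M \sigma 2^{-Ml} \le \sigma 2^{-l}, \\
\sum_{m\in\mathbf{Z}} \gamma 2^{-l}(|m|+1)^{-2}
  &= \gamma 2^{-l}\sum_{m\in\mathbf{Z}}(|m|+1)^{-2}
   \lesssim \gamma 2^{-l}.
\end{align*}
```

The first line uses $[\log_2|m|] \ge \log_2|m| - 1$; adding the two bounds gives a right-hand side of the form $(\sigma+\gamma)2^{-l}$.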

A final application of the triangle inequality with respect to $l$ in (82), (83) and (88),
together with Theorem 8.7 and the fact that $|S_{x,l}| \le \beta 2^{2l}$, leads to

IKE E U*) E ^(^^))^z||M^(R)</3 1/2 - 1/r ( ( T + 7). (89) 

By putting together the estimates from (80) and (89) the conclusion of our theorem 
follows. ■ 

We continue with the variant of Theorem 10.1 that will prove useful in the proof of the
oscillation inequality in Theorem 6.3. To this end, let $U := (u_j)_{j=1}^{J}$ be a finite sequence
of integers and recall the oscillation-variational norm $\|\cdot\|_{O_U \cap V^r}$ introduced in (49).

Theorem 10.2. Assume we are in the settings preceding Theorem 10.1 and also that the
following additional requirement is satisfied

\[
\sup_{s\in\mathcal{P}} \frac{|a_s|}{|I_s|^{1/2}} \le \sigma.
\]

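Define the exceptional sets

\[
E^{(1)} = \bigcup_{l\ge 0} \Big\{ x : \sum_{T\in\mathcal{F}} 1_{2^l I_T}(x) > \beta 2^{2l} \Big\},
\]

\[
E^{(2)} = \bigcup_{l,m\ge 0}\ \bigcup_{T\in\mathcal{F}^{(2)}_{l,m}} \Big\{ x : \Big\| \sum_{\substack{s\in T \\ |I_s| < 2^j}} a_s \phi^{(\alpha(l,m))}_s(x,\xi_T) \Big\|_{O_U\cap V^r_j(\mathbf{Z})} > \gamma 2^{-l}(|m|+1)^{-2} \Big\},
\]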

E^ = {J |J {x: || a s ^ T (x,^)\\o v nv f (z)>^ 1 }, 



1>0 Tc t(2) »6T 
■ lt - r I + l,0 |/sl<23 

where the symbol $\alpha(l,m)$ equals $l$ if $m = 0$ and $l + [\log_2 |m|]$ if $m \neq 0$. Then for each
$x \notin E^{(1)} \cup E^{(2)} \cup E^{(3)}$ and each $g$ with $\|g\|_{L^2(\mathbf{R})} = 1$ we have the uniform pointwise estimate

sup \^{ «A(x,W)}W|||i SfR) ) 1/2 <J^)9 1 - 2/r (7 + ^), 

. =1 Uj<k<u j+1 s£S 

2 u i<|/ s |<2fe 

with the implicit constants depending only on $r$.
Proof. The proof follows closely the lines of the proof of Theorem 10.1. Fix $x \notin E^{(1)} \cup
E^{(2)} \cup E^{(3)}$ and fix $g$ with $\|g\|_{L^2(\mathbf{R})} = 1$. We will use the notation introduced in the
beginning of the proof of Theorem 10.1 and the representation



^2 a s (j) s (x,6)= ^ l ^ 6 ) E a s (f) s (x,9) 
seV Lx -.\i s \<2 k we^fej ^T hx 

+ (!- E **) E 



(90) 



(91) 



Then, by using the triangle inequality in /, Corollary 8.13, inequality (79) and the fact that 
x E^ we get the following estimate for contribution to the term (91): if ||g||L 2 (R) = 1, 
then 



sup |E^{[(1- E !-)-(!- E E 



E 

< J^p l - 2 ' r (a + 1 ). 
The multiplier in (90) is of the form $\sum_{\omega\in\Omega_{k,l}} 1_{\omega}(\theta)\, m_{\omega}(x,\theta)$, where we define as before

m u (x,6) = ^ a s<f>s(x,0)- 



To estimate 



E 



sup \^{ lMm u (x, 9)g(9)- ^ l^m^x, 9)g{9)}{, 

Uj <k<u J+1 wgQ 



1/2 



we will use Theorem 8.11 with $\Omega_{k,l}$ as the collection $R_k$. Fix $l \ge 0$ and consider a collection
of nested intervals $\omega_k \in \Omega_{k,l}$, $k \in \mathbf{Z}$. For the remaining part of the proof we will be
concerned with obtaining pointwise estimates in $x$ for the quantity



IK? 1 ** E "^foon^iiounw) 



(92) 



where the inverse Fourier transform of the innermost expression is taken with respect to
the variable $\theta$. Also, given the estimates for the $V^r$ norm from the proof of the previous
theorem, all that is left is getting the corresponding oscillation estimates. Split as before

\[
m_{\omega_k} = m^{(1)}_{\omega_k} + m^{(2)}_{\omega_k} + m^{(3)}_{\omega_k}.
\]

The same type of estimates as in (82) lead to the following estimate for the error term
associated with the multiplier $m^{(2)}_{\omega_k}$:

\[
\big\| \|(g 1_{\omega_k} m^{(2)}_{\omega_k}(x,\cdot))^{\vee}(z)\|_{O_U} \big\|_{L^2_z(\mathbf{R})} \lesssim \sigma 2^{-Ml}. \tag{93}
\]

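To estimate the second error term associated with the multiplier $m^{(3)}_{\omega_k}$ we proceed like
in (83). By invoking the second part of Lemma 8.1, Lemma 8.5 and the fact that $\|\cdot\|_{O_U} \lesssim
\|\cdot\|_{l^1}$ we get

\[
\big\| \|(g 1_{\omega_k} m^{(3)}_{\omega_k}(x,\cdot))^{\vee}(z)\|_{O_U} \big\|_{L^2_z(\mathbf{R})} \lesssim \sigma 2^{-Ml}. \tag{94}
\]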
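The last task is to get estimates for the main term. We split as before

\[
m^{(1)}_{\omega_k} = \sum_{m\in\mathbf{Z}} m^{(1,1)}_{\omega_k,m} + \sum_{m\in\mathbf{Z}} m^{(1,2)}_{\omega_k,m}
\]

and estimate

\begin{align*}
\big\| \|(g 1_{\omega_k} m^{(1)}_{\omega_k}(x,\cdot))^{\vee}(z)\|_{O_U} \big\|_{L^2_z(\mathbf{R})}
&\le \sum_{m\in\mathbf{Z}} \big\| \|(g 1_{\omega_k} m^{(1,1)}_{\omega_k,m}(x,\cdot))^{\vee}(z)\|_{O_U} \big\|_{L^2_z(\mathbf{R})} \tag{95} \\
&\quad + \sum_{m\in\mathbf{Z}} \big\| \|(g 1_{\omega_k} m^{(1,2)}_{\omega_k,m}(x,\cdot))^{\vee}(z)\|_{O_U} \big\|_{L^2_z(\mathbf{R})}. \tag{96}
\end{align*}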

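The same discussion as the one regarding the derivation of inequalities (86) and (87)
shows that

\begin{align*}
\big\| \|(g 1_{\omega_k} m^{(1,1)}_{\omega_k,m}(x,\cdot))^{\vee}(z)\|_{O_U} \big\|_{L^2_z(\mathbf{R})} &\lesssim \sigma 2^{-M\alpha(l,m)}, \tag{97} \\
\big\| \|(g 1_{\omega_k} m^{(1,2)}_{\omega_k,m}(x,\cdot))^{\vee}(z)\|_{O_U} \big\|_{L^2_z(\mathbf{R})} &\lesssim \gamma 2^{-l}(|m|+1)^{-2}. \tag{98}
\end{align*}

Thus, summation over $m$ in inequalities (97) and (98) leads to

\[
\big\| \|(g 1_{\omega_k} m^{(1)}_{\omega_k}(x,\cdot))^{\vee}(z)\|_{O_U} \big\|_{L^2_z(\mathbf{R})} \lesssim (\sigma + \gamma) 2^{-l}. \tag{99}
\]

A final application of the triangle inequality with respect to $l$ in (93), (94) and (99),
together with Theorem 8.11 and the fact that $|S_{x,l}| \le \beta 2^{2l}$, leads to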



j-i 

(E 

j'=i 



sup I^^H lMrnu(x,6)g(6)- E 

Uj <k<u j+1 j> ajef7fe; wen„. ( 



2 

J 1/2 



< j£% 1 " 2/r (a + 7 ). (100) 



By putting together the estimates from (10) and (100), the conclusion of our theorem 
follows. ■ 

We close this section with a square function estimate for the Carleson-Hunt operator 
that will play the decisive role in the proof of Theorem 6.4. The proof does not contain 
any serious new ideas, other than the ones used in the proof of Theorem 10.1 to estimate 
the $L^\infty$ norm of the Carleson-Hunt operator.

Theorem 10.3. Assume we are in the settings preceding Theorem 10.1 and also that the
following additional requirement is satisfied

\[
\sup_{s\in\mathcal{P}} \frac{|a_s|}{|I_s|^{1/2}} \le \sigma.
\]



Define the exceptional set 

e-=U U {-(Ei E «4 ) t(-^t)i 2 ) 1/2 >72- ? }. 



l>0 T c-r( 2 ) 
x t - r i+i,o 



j'eZ «6T 



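Then for each $x \notin E$ we have

\[
\Big\| \Big( \sum_{k\in\mathbf{Z}} \Big| \sum_{\substack{s\in\mathcal{P} \\ |I_s| = 2^k}} a_s \phi_s(x,\theta) \Big|^2 \Big)^{1/2} \Big\|_{M_{2,\theta}(\mathbf{R})} \lesssim \sigma + \gamma. \tag{101}
\]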



Proof. We will assume again the notation introduced in the beginning of the proof of Theorem 10.1.
Fix $x \notin E$ and $\theta \in \mathbf{R}$. Note that the main contribution to (101) comes from a
single tree. More precisely, let $s_0$ be a maximal element in the collection

\[
A = \{ s \in \mathcal{P}_{l,x} : \theta \in \omega_{s,2} \}.
\]

If $T \in \mathcal{F}$ (with top $T$) denotes one of the trees such that $s_0 \in G(T)$, then nestedness
implies that $A \subseteq G(T)$. We also recognize as a consequence of the definition of $\mathcal{P}_{l,x}$ that

(£| £ aA(M)| 2 ) 1/2 = (£l E «A(M)| 2 ) 1/2 



|/sl=2 fc 



fcGZ s€G(T)nP iv 
|/ s |=2* 



<(Ei E «am)i 2 ) 1/2 



kez 



s6T J+l,0 n7 V 
|/ s |=2* 



+ (E i E a ^ 

|/ s |=2 fc 

+ (£< £ 

fceZ s£P:|/ s |<|/ T | 

dist(i,/ s )>2 ; - 2 |/ T 
|/ s |=2* 



(0 

s,T 



(2) 



2U/2 



'1+1,0 



|aA(M)l) 2 ) 1/2 , 



(102) 
(103) 
(104) 



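The term (104) is bounded in the same manner as the term (78) by $\sigma 2^{-Ml}$ (uniformly
in $\theta$).

We next focus on (102). Note that at most one scale in $T^{(1)}_{l+1,0}$ contributes to the
summation (102). Thus estimates like the ones for (76) prove that

\[
\Big( \sum_{k\in\mathbf{Z}} \Big| \sum_{\substack{s\in T^{(1)}_{l+1,0}\cap\mathcal{P}_{l,x} \\ |I_s| = 2^k}} a_s \phi_s(x,\theta) \Big|^2 \Big)^{1/2}
\le \sup_{j} \sum_{\substack{s\in\mathcal{P}:\ |I_s| = 2^j \\ x \notin 2^{l-1} I_s}} |a_s \phi_s(x,\theta)|
\lesssim \sigma 2^{-Ml}.
\]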

Before we evaluate the sum corresponding to the 2-quasitree, we recall from the proof
of Theorem 10.1 that there exists $n_T$ depending on $x$ and $l$ such that

T, ( i ) 1 , n^ = {aGT{J ) 1>0 :2" T <|/.|}. 



Also, recall that if (2) (x, 9) ^ for some s E , then \9 — £t ;+1 | < 

S ' T i+l,0 

We then estimate 



(Ei E a .^U M)i 2 ) 1/2 = ( E i E ^' 2 ) 1/2 



1+1,0 



;+i,o '.^ i-i i+i,o 

|/ s |=2* -' tT *+l,0 |/ s |=2fc 

<( E ( E i fl .ii^U M)-<^U (^^ + ,o)i) 2 ) 1/2 

— 1 S1 i+1,0 st i + 1,0 

|/ s |=2 fc 

+ (El E ^'U (^e Ti+1 , )| 2 ) 1/2 <a2-^ + 7 2-', 

fcez seT (2) 
sfcl i+i,o 

I Is 1=2* 

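where in the last inequality we have used the fact that $x \notin E$.
We thus end up having the following estimate

\[
\Big\| \Big( \sum_{k\in\mathbf{Z}} \Big| \sum_{\substack{s\in\mathcal{P}_{l,x} \\ |I_s| = 2^k}} a_s \phi_s(x,\theta) \Big|^2 \Big)^{1/2} \Big\|_{M_{2,\theta}(\mathbf{R})} \lesssim \sigma 2^{-Ml} + \gamma 2^{-l}. \tag{105}
\]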



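Finally, the triangle inequality in $l$ shows that

\[
\Big\| \Big( \sum_{k\in\mathbf{Z}} \Big| \sum_{\substack{s\in\mathcal{P} \\ |I_s| = 2^k}} a_s \phi_s(x,\theta) \Big|^2 \Big)^{1/2} \Big\|_{M_{2,\theta}(\mathbf{R})}
\le \sum_{l\ge 0} \Big\| \Big( \sum_{k\in\mathbf{Z}} \Big| \sum_{\substack{s\in\mathcal{P}_{l,x} \\ |I_s| = 2^k}} a_s \phi_s(x,\theta) \Big|^2 \Big)^{1/2} \Big\|_{M_{2,\theta}(\mathbf{R})}
\lesssim \sigma + \gamma.
\]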



11. Proof of Theorems 6.2, 6.3 and 6.4 

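We will present the proof of Theorem 6.2 in detail and then indicate the modifications
that have to be made in the argument to get Theorems 6.3 and 6.4. Let $U = (u_j)_{j=1}^{J}$
be an arbitrary finite sequence of integers. For each collection of tiles $\mathbf{S}' \subseteq \mathbf{S}$ define the
following operators, relevant for the three theorems we aim to prove:

\[
T_{\mathbf{S}'} f(x) := \Big\| \Big( \sum_{\substack{s\in\mathbf{S}' \\ |I_s| < 2^k}} \langle f, \varphi_s\rangle \phi_s(x,\theta) \Big)_{k\in\mathbf{Z}} \Big\|_{M^*_{2,\theta}(\mathbf{R})},
\]

\[
O_{\mathbf{S}'} f(x) := \sup_{\|g\|_{L^2(\mathbf{R})} = 1} \Big( \sum_{j=1}^{J-1} \Big\| \sup_{u_j \le k < u_{j+1}} \Big| \Big\{ \sum_{\substack{s\in\mathbf{S}' \\ 2^{u_j} \le |I_s| < 2^k}} \langle f, \varphi_s\rangle \phi_s(x,\theta) g(\theta) \Big\}^{\vee}(z) \Big| \Big\|^2_{L^2_z(\mathbf{R})} \Big)^{1/2},
\]

\[
Q_{\mathbf{S}'} f(x) := \Big\| \Big( \sum_{k\in\mathbf{Z}} \Big| \sum_{\substack{s\in\mathbf{S}' \\ |I_s| = 2^k}} \langle f, \varphi_s\rangle \phi_s(x,\theta) \Big|^2 \Big)^{1/2} \Big\|_{M_{2,\theta}(\mathbf{R})}.
\]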

Let $V$ denote any of these operators. Define $c(V,p)$ to equal 1 if $V$ is either $T$ or $Q$, and
$J^{1/2-\epsilon(p)}$ when $V = O$. For each $1 < p < \infty$, the index $\epsilon(p)$ is some number in $(0, \tfrac{1}{2})$
whose value will become implicit later in the argument, without however being computed
explicitly.

There is a common part in the argument for all three operators above, and we will
describe it in the following. Note that for each $\mathbf{S}'$ the operator $V_{\mathbf{S}'}$ is sublinear as a
function of $f$. Also, for each $f$ and $x$ the mapping $\mathbf{S}' \to V_{\mathbf{S}'} f(x)$ is sublinear as a function
of the tile set $\mathbf{S}'$. We will prove in the following that

\[
m\{x : V_{\mathbf{S}} 1_F(x) > \lambda\} \lesssim c(V,p) \frac{|F|}{\lambda^p}, \tag{106}
\]

for each $F \subset \mathbf{R}$ of finite measure, each $\lambda > 0$ and each $1 < p < \infty$. Then, by invoking
the Marcinkiewicz interpolation theorem and restricted weak type interpolation we get
for each $1 < p < \infty$ that

\[
\|V_{\mathbf{S}} f\|_p \lesssim d(V,p) \|f\|_p,
\]

where $d(V,p)$ equals 1 if $V$ is either $T$ or $Q$, and $J^{1/2-\epsilon(p)}$ when $V = O$, for some appropriate
$\epsilon(p) \in (0, \tfrac{1}{2})$ whose value will become implicit later.

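Fix $F$ and $\lambda$. We first prove (106) in the case $\lambda < 1$. Define the first exceptional set

\[
E := \{x : M_p 1_F(x) > \lambda\}
\]

and note that $|E| \lesssim \frac{|F|}{\lambda^p}$. Split $\mathbf{S} = \mathbf{S}_1 \cup \mathbf{S}_2$ where

\[
\mathbf{S}_1 := \{s \in \mathbf{S} : I_s \cap E^c \neq \emptyset\},
\qquad
\mathbf{S}_2 := \{s \in \mathbf{S} : I_s \cap E^c = \emptyset\}.
\]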
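
The bound on $|E|$ is the weak-type estimate for the maximal function. A short sketch, under the assumption (not restated in this section) that $M_p f := (M|f|^p)^{1/p}$ with $M$ the Hardy-Littlewood maximal operator, so that $M_p 1_F = (M 1_F)^{1/p}$:

```latex
\[
|E| = |\{x : M 1_F(x) > \lambda^p\}|
    \lesssim \frac{\|1_F\|_{L^1(\mathbf{R})}}{\lambda^p}
    = \frac{|F|}{\lambda^p}.
\]
```

Only the weak type $(1,1)$ bound for $M$ is used here, which is why the estimate is available for every $1 < p < \infty$.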

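Decompose $E = \bigcup_i E_i$ as a disjoint union of intervals $E_i$ and define $E' := \bigcup_i 2E_i$. Let us
first show that

\[
m\{x \in (E')^c : V_{\mathbf{S}_2} 1_F(x) > \lambda\} \lesssim \frac{|F|}{\lambda^p}. \tag{107}
\]

Split $\mathbf{S}_2 = \bigcup_{\kappa \ge 1} \mathbf{S}_2^{\kappa}$ where

\[
\mathbf{S}_2^{\kappa} := \{s \in \mathbf{S}_2 : 2^{\kappa-1} I_s \cap E^c = \emptyset,\ 2^{\kappa} I_s \cap E^c \neq \emptyset\}.
\]

Note further that if $s \in \mathbf{S}_2^{\kappa}$ then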

%M < mf MpM*) < 2* inf M P 1 F (*) < A2 K . 

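We next partition $\mathbf{S}_2^{\kappa} = \bigcup_{i,l} \mathbf{S}_2^{\kappa,i,l}$, where $\mathbf{S}_2^{\kappa,i,l} := \{s \in \mathbf{S}_2^{\kappa} : I_s \subset E_i,\ |I_s| = 2^l\}$, and
observe that for each $x \in (E')^c$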



2 

M 



^EE E «**)^EE E ^,m(|t) sE^Exi, 

We next apply the Fefferman-Stein inequality [20]

||VsgliH|LP((E') c ) ~ A2~ Mk || y^x|JUp(R) ^ \2~ Mk \E\ iIp . 



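Now we can write

\[
m\{x \in (E')^c : V_{\mathbf{S}_2} 1_F(x) > \lambda\}
\le \lambda^{-p} \Big( \sum_{\kappa \ge 1} \|V_{\mathbf{S}_2^{\kappa}} 1_F\|_{L^p((E')^c)} \Big)^p
\lesssim |E| \lesssim \frac{|F|}{\lambda^p}.
\]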
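
The chain of inequalities above is Chebyshev's inequality followed by the triangle inequality and a geometric series. As a sketch, assuming the Fefferman-Stein step yields $\|V_{\mathbf{S}_2^{\kappa}} 1_F\|_{L^p((E')^c)} \lesssim \lambda 2^{-M\kappa} |E|^{1/p}$ with $M > 0$, and using the sublinearity of $\mathbf{S}' \mapsto V_{\mathbf{S}'} f(x)$:

```latex
\begin{align*}
m\{x\in (E')^c : V_{\mathbf{S}_2}1_F(x) > \lambda\}
 &\le \lambda^{-p}\,\|V_{\mathbf{S}_2}1_F\|_{L^p((E')^c)}^p
  \le \lambda^{-p}\Big(\sum_{\kappa\ge 1}\|V_{\mathbf{S}_2^{\kappa}}1_F\|_{L^p((E')^c)}\Big)^p \\
 &\lesssim \lambda^{-p}\Big(\sum_{\kappa\ge 1}\lambda\, 2^{-M\kappa}\,|E|^{1/p}\Big)^p
  \lesssim |E| \lesssim \frac{|F|}{\lambda^p}.
\end{align*}
```

The factor $\lambda$ gained from each $\mathbf{S}_2^{\kappa}$ exactly cancels the $\lambda^{-p}$ from Chebyshev's inequality, so the final bound is governed by $|E|$ alone.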

The rest of the proof in the case $\lambda < 1$ is devoted to arguing that

\[
m\{x \in \mathbf{R} : V_{\mathbf{S}_1} 1_F(x) > \lambda^{1-\epsilon}\} \lesssim c(V,p) \frac{|F|}{\lambda^p}, \tag{108}
\]

for each $\epsilon > 0$, with the implicit constant depending only on $\epsilon$ and $p$. By combining this
with the previous estimates, we get that for each $1 < p < \infty$, $0 < \lambda < 1$, $\epsilon > 0$ and each
$F \subset \mathbf{R}$ of finite measure, there is an exceptional set of measure $\lesssim \lambda^{-p}|F|$ such that for $x$
outside this set

\[
V_{\mathbf{S}} 1_F(x) \lesssim c(V,p) \lambda^{1-\epsilon}.
\]

Finally, this will easily imply (106) for $\lambda < 1$, since the range of $p$ is open. The proofs
of (108) and (106) in the case $\lambda > 1$ for the operators $T_{\mathbf{S}}$, $O_{\mathbf{S}}$ and $Q_{\mathbf{S}}$ are very similar;
the only difference appears in the choice of the exceptional set. We start by giving the
full details for the operator $T_{\mathbf{S}}$, and then briefly indicate the modifications needed for the
other two operators.

11.1. The estimates for $T_{\mathbf{S}}$. We start by proving (108). Proposition 7.6 guarantees
that $\mathrm{size}(\mathbf{S}_1) \lesssim \lambda$, where the size is understood here with respect to the function $1_F$.
Define $A := [-\log_2(\mathrm{size}(\mathbf{S}_1))]$. Use the result of Proposition 7.7 to split $\mathbf{S}_1$ as a disjoint
union $\mathbf{S}_1 = \bigcup_{n \ge A} \mathcal{P}_n$, where $\mathrm{size}(\mathcal{P}_n) \le 2^{-n}$ and each $\mathcal{P}_n$ consists of a family $\mathcal{F}_{\mathcal{P}_n}$ of trees
satisfying

\[
\sum_{T\in\mathcal{F}_{\mathcal{P}_n}} |I_T| \lesssim 2^{2n} |F|. \tag{109}
\]

Let $\epsilon > 0$ be an arbitrary positive number. For each $n \ge A$ define $\sigma := 2^{-n}$, $\beta := 2^{3n} \lambda^p$,
$\gamma := 2^{-n/2} \lambda^{1/2-\epsilon}$. Define $a_s := \langle 1_F, \varphi_s\rangle$ for each $s \in \mathcal{P}_n$ and note that the collection
$\mathcal{P}_n$ together with the coefficients $(a_s)_{s\in\mathcal{P}_n}$ satisfy the requirements of Theorem 10.1. Let
$\mathcal{F}^{(2)}_{\mathcal{P}_n,l,m}$ be the collection of all the 2-quasitrees $T^{(2)}_{l,m}$ obtained from all the trees $T \in \mathcal{F}_{\mathcal{P}_n}$ by
the procedure described in the beginning of the previous section. Define the corresponding
exceptional sets

^ 1); =l> : E ww>^}, 

E( n ] •■= U U i x ■ II E «.^ ,,m)) (^eT)||v7(z) > 72-'(|m| + I)" 2 }, 



Ei n } -={J U I*'- II E ^tO^II^z) > 72-'}. 



«>0r rc7r (2) seT 
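
By (109) and the fact that $\lambda < 1$ we get

\[
|E_n^{(1)}| \lesssim 2^{-n} \lambda^{-p} |F|.
\]

By Theorem 9.4 and the fact that $\lambda < 1$, for each $1 < s < \infty$ we get

\begin{align*}
|E_n^{(2)}| &\lesssim \gamma^{-s} \sigma^{s-2} |F| \lesssim 2^{-n(s/2-2)} \lambda^{-s(1/2-\epsilon)} |F|, \\
|E_n^{(3)}| &\lesssim \gamma^{-s} \sigma^{s-2} |F| \lesssim 2^{-n(s/2-2)} \lambda^{-s(1/2-\epsilon)} |F|.
\end{align*}

Define

\[
E^* := \bigcup_{n \ge A} \big( E_n^{(1)} \cup E_n^{(2)} \cup E_n^{(3)} \big).
\]

Note that since $A \ge \log_2(\lambda^{-1})$, we have $|E^*| \lesssim \lambda^{-p}|F|$, an estimate which can be seen by
using a sufficiently large $s$.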
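
The role of a large $s$ can be made explicit by summing the three measure bounds in $n$. The following is a sketch, not the authors' computation; it assumes that the lower bound on $A$ is used in the form $2^{-A} \lesssim \lambda$ (which follows from $\mathrm{size}(\mathbf{S}_1) \lesssim \lambda$ if $[\cdot]$ is the integer part) and that $s > 4$, so that the geometric series converges.

```latex
\begin{align*}
\sum_{n\ge A}|E_n^{(1)}|
  &\lesssim \lambda^{-p}|F|\sum_{n\ge A}2^{-n}
   \lesssim 2^{-A}\lambda^{-p}|F|
   \lesssim \lambda^{1-p}|F| \le \lambda^{-p}|F|, \\
\sum_{n\ge A}\big(|E_n^{(2)}|+|E_n^{(3)}|\big)
  &\lesssim \lambda^{-s(1/2-\epsilon)}|F|\sum_{n\ge A}2^{-n(s/2-2)}
   \lesssim_s 2^{-A(s/2-2)}\lambda^{-s(1/2-\epsilon)}|F|
   \lesssim \lambda^{s\epsilon-2}|F|.
\end{align*}
```

Since $\lambda < 1$, the last quantity is at most $\lambda^{-p}|F|$ as soon as $s\epsilon - 2 \ge -p$, that is, for $s \ge (2-p)/\epsilon$; this is where a sufficiently large $s$ is needed when $p < 2$.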

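For each $x \notin E^*$, Theorem 10.1 guarantees that

\begin{align*}
\Big\| \Big( \sum_{\substack{s\in\mathbf{S}_1 \\ |I_s| < 2^k}} \langle 1_F, \varphi_s\rangle \phi_s(x,\theta) \Big)_{k\in\mathbf{Z}} \Big\|_{M^*_{2,\theta}(\mathbf{R})}
&\le \sum_{n \ge A} \Big\| \Big( \sum_{\substack{s\in\mathcal{P}_n \\ |I_s| < 2^k}} \langle 1_F, \varphi_s\rangle \phi_s(x,\theta) \Big)_{k\in\mathbf{Z}} \Big\|_{M^*_{2,\theta}(\mathbf{R})} \\
&\lesssim \sum_{n \ge A} n \big[ 2^{(3(r/2-1)-1)n} \lambda^{p(r/2-1)} + 2^{(3(r/2-1)-1/2)n} \lambda^{p(r/2-1)+1/2-\epsilon} \big] \\
&\lesssim \lambda^{1-2\epsilon},
\end{align*}

if $r$ is chosen sufficiently close to 2, depending on $p$ and $\epsilon$. This ends the proof of (108),
and hence the proof of (106) in the case $\lambda < 1$.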
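
To see why $r$ close to 2 suffices in the last step, write $\delta := 3(r/2-1)$ and fix a small $\eta > 0$. The sketch below is an illustration under the assumptions $2^{-A} \lesssim \lambda$, $\lambda < 1$ and $\delta + \eta < 1/2$; the symbols $\delta$ and $\eta$ are introduced here only for this computation.

```latex
\begin{align*}
\sum_{n\ge A} n\,2^{-(1-\delta)n}\,\lambda^{p\delta/3}
  &\lesssim_\eta \sum_{n\ge A}2^{-(1-\delta-\eta)n}
   \lesssim 2^{-(1-\delta-\eta)A}
   \lesssim \lambda^{1-\delta-\eta}, \\
\sum_{n\ge A} n\,2^{-(1/2-\delta)n}\,\lambda^{p\delta/3+1/2-\epsilon}
  &\lesssim_\eta \lambda^{1/2-\epsilon}\,2^{-(1/2-\delta-\eta)A}
   \lesssim \lambda^{1-\epsilon-\delta-\eta}.
\end{align*}
```

Both right-hand sides are at most $\lambda^{1-2\epsilon}$ once $\delta + \eta \le \epsilon$, which holds when $r$ is close enough to 2; the factor $\lambda^{p\delta/3} \le 1$ was simply discarded.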

We next focus on proving (106) in the case $\lambda > 1$. In the remaining part of the
discussion the size will be understood with respect to the function $\lambda^{-1} 1_F$. Proposition 7.6
implies that $\mathrm{size}(\mathbf{S}) \lesssim \lambda^{-1}$. Define $A := [-\log_2(\mathrm{size}(\mathbf{S}))]$. Split $\mathbf{S}$ as before, as a disjoint
union $\mathbf{S} = \bigcup_{n \ge A} \mathcal{P}_n$, where $\mathrm{size}(\mathcal{P}_n) \le 2^{-n}$ and each $\mathcal{P}_n$ consists of a family $\mathcal{F}_{\mathcal{P}_n}$ of trees
satisfying

\[
\sum_{T\in\mathcal{F}_{\mathcal{P}_n}} |I_T| \lesssim 2^{2n} \lambda^{-2} |F|. \tag{110}
\]

For each $n \ge A$ define $\sigma := 2^{-n}$, $\beta := 2^{(p+1)n}$ and $\gamma := 2^{-n/2}$. Define also $a_s :=
\langle \lambda^{-1} 1_F, \varphi_s\rangle$ for each $s \in \mathcal{P}_n$ and note that the collection $\mathcal{P}_n$ together with the coefficients
$(a_s)_{s\in\mathcal{P}_n}$ satisfy the requirements of Theorem 10.1. Let $\mathcal{F}^{(2)}_{\mathcal{P}_n,l,m}$ be the collection of all the
2-quasitrees obtained from all the trees $T \in \mathcal{F}_{\mathcal{P}_n}$ by the procedure described in the
beginning of the previous section. Define the corresponding exceptional sets

£i 1): =l> : E W*)>/32 2 '}, 

E ( n ] '■= U U i* ■ II E ^' m) \x,CT)\\vr {Z) > l2- l (H + I)" 2 }, 

!,m>0 T(: r(2) seT 
L>z - r V n ,l,m \I S \<23 

E ^ ■= U U i x ■ II E a.^Tfo&OH^z) > 72-'}. 

i>0 Tc ^(2) s£T 
±fc - r T' n ,i+l,() |/ S |<2J 

By (110) and the fact that $\lambda > 1$ we get

\[
|E_n^{(1)}| \lesssim 2^{-(p-1)n} \lambda^{-2} |F|.
\]

By Theorem 9.4 and the fact that $\lambda > 1$, for each $1 < s < \infty$ we get

\begin{align*}
|E_n^{(2)}| &\lesssim \gamma^{-s} \sigma^{s-2} \lambda^{-2} |F| \lesssim 2^{-n(s/2-2)} \lambda^{-2} |F|, \\
|E_n^{(3)}| &\lesssim \gamma^{-s} \sigma^{s-2} \lambda^{-2} |F| \lesssim 2^{-n(s/2-2)} \lambda^{-2} |F|.
\end{align*}

Define

\[
E^* := \bigcup_{n \ge A} \big( E_n^{(1)} \cup E_n^{(2)} \cup E_n^{(3)} \big).
\]

Note that since $A \ge \log_2(\lambda)$, we have $|E^*| \lesssim \lambda^{-p}|F|$, an estimate which can be seen by
using a sufficiently large $s$.

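For each $x \notin E^*$, Theorem 10.1 guarantees that

\begin{align*}
\Big\| \Big( \sum_{\substack{s\in\mathbf{S} \\ |I_s| < 2^k}} \langle \lambda^{-1} 1_F, \varphi_s\rangle \phi_s(x,\theta) \Big)_{k\in\mathbf{Z}} \Big\|_{M^*_{2,\theta}(\mathbf{R})}
&\le \sum_{n \ge A} \Big\| \Big( \sum_{\substack{s\in\mathcal{P}_n \\ |I_s| < 2^k}} \langle \lambda^{-1} 1_F, \varphi_s\rangle \phi_s(x,\theta) \Big)_{k\in\mathbf{Z}} \Big\|_{M^*_{2,\theta}(\mathbf{R})} \\
&\lesssim \sum_{n \ge A} n\, 2^{(p+1)(r/2-1)n} \big( 2^{-n} + 2^{-n/2} \big) \lesssim 1,
\end{align*}

if $r$ is chosen sufficiently close to 2, depending only on $p$. This ends the proof of (106) in
the case $\lambda > 1$.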

11.2. The estimates for $O_{\mathbf{S}}$. To prove (108) and (106) in the case $\lambda > 1$ for $O_{\mathbf{S}}$, we use
the same values for $a_s$, $\sigma$, $\beta$ and $\gamma$ as in the case of $T$ and work with the exceptional sets

EP:=\J{x-- Y h< lT (x)>(V 2l h 

E( n ] ■= U U i x ■ II E «.^ ,,m)) (^^)||o„nv 7 (z) > 72-'(|m| + l)" 2 }, 



l,m>0 rp c7 r(2) «€T 



E( n ] ■= U U {^11 E ^t^'^IIo^z) > 72~'} 



!>0 Tc)r (2) s£T 
^PnJ + l.O |I„|<23 



11.3. The estimates for $Q_{\mathbf{S}}$. To prove (108) and (106) in the case $\lambda > 1$ for the operator
$Q_{\mathbf{S}}$, we use the same values for $a_s$, $\sigma$, $\beta$ and $\gamma$ as in the case of $T$ and define the exceptional
set

E -=U U {^(El E ^°T^eT)| 2 ) 1/2 >72-'}. 

I>0 Tcr (2) jeZ sST 

■ lt - r i+l,0 |/ s |=23 